INTERCORRELATIONS AND FACTOR ANALYSIS OF TESTS GIVEN TO TEACHING CANDIDATES


J. C. GOWAN
University of California at Los Angeles 


THE STUDY reported below grew out of a larger program of assessment
and evaluation of teaching candidates reported elsewhere (7),
supplemented by smaller samples from other teacher training
institutions. The purpose of the present report is to point out
certain interrelationships between testing instruments which may
have significance beyond the context in which the tests were given.


Sample One 


1. Scope of Study 


This study gives intercorrelations and resultant factor analyses
from a matrix of order 62 produced by the scores of teaching
candidates on a battery of tests. In addition to the large number
of scales making up the matrix, the study is of interest because
of the sizeable number of subjects involved. These subjects were
junior, senior and graduate students at the University of
California, Los Angeles. The tests administered were part of a
required series given by the Teacher Selection and Counseling
Service of the School of Education. This agency processes about
1400 cases per year.

It is not the purpose of this paper to explore the literature, to
engage in discussion about criteria of teaching success, or to
develop teacher prognosis scales. These matters are handled
elsewhere (1, 4, 5, 6, 9, 10). The design is simply to detail as
briefly as possible the rather extensive intercorrelations and the
factor analysis results accruing therefrom.


2. Tests Used and Method of Procedure


The tests used in this study were the Cooperative English Test,
the Stanford Arithmetic Test, the American Council Psychological
Examination, the Minnesota Multiphasic Personality Inventory, the
California Psychological Inventory, the Revised Study of Values
(Allport et al.), the Guilford-Zimmerman Temperament Survey, and
two new scales on the MMPI alleged to predict teaching success or
failure. The teaching scales were the plus and minus sections of a
scale devised by the authors and detailed more fully elsewhere (5).
The three thousand odd correlation coefficients needed for the
study were obtained from IBM cards by a method outlined by J. C.
Flanagan (2). Briefly, this method consists of determining what
percent of the top 27 percent criterion group on the independent
variable lie above the median score on the dependent variable, and
what percent of the bottom 27 percent criterion group on the
independent variable behave in similar fashion.

The names of the scales of the various tests, their code symbols,
and the variable numbers are detailed in Table I. It will be
observed that the scale codes are grouped so that all scales for a
certain test have a common initial letter. Care has been taken so
that the scale numbers correspond (as in the MMPI) to commonly
accepted usage. Validating scales have small letters. A few of the
scales were not used throughout the study. For example, D5, the
Masculinity-Femininity scale of the MMPI, was not used because the
scoring differs for men and women. No attempt was made to
correlate scales D11, D12, D13 or D14 with any of the later scales,
since Gough used these scales to develop corresponding scales (E5,
E1, E4, E10) on the Psychological Inventory. The intercorrelations
of the A group with groups beyond D are also missing, since a
factor analysis showed that practically all the variance of this
group was being cared for by the
TABLE I

NAMES AND CODE SYMBOLS FOR SCALES USED IN INTERCORRELATION STUDY I


Code   Name of Scale

A      Cooperative and Stanford
A1     Cooperative English Vocabulary
A2     Cooperative English Speed
A3     Cooperative English Level
A4     Cooperative English Mechanics
A5     Cooperative English Effectiveness
A6     Stanford Arithmetic (Part 1)

B      American Council Psychological
B1     Q: Quantitative
B2     L: Linguistic
B3     T: Total

C      Allport Study of Values
C1     T: Theoretical
C2     E: Economic
C3     A: Aesthetic
C4     S: Social
C5     P: Political
C6     R: Religious

D      Minnesota Multiphasic Inventory
Da     L: Lie
Db     F: Falsification
Dc     K: Suppressor Variable
D1     Hs: Hypochondriasis
D2     D: Depression
D3     Hy: Hysteria
D4     Pd: Psychopathic Deviate
D6     Pa: Paranoia
D7     Pt: Psychasthenia
D8     Sc: Schizophrenia
D9     Ma: Hypomania
D10    Si: Social Introversion
D11    Do: Dominance
D12    Re: Social Responsibility
D13    St: Status
D14    Ac: Academic Achievement

E      California Psychological Inventory
Ea     Im: Infrequency
Eb     Gi: Good Impression
Ec     Ds: Dissimulation
E1     Re: Social Responsibility
E2     To: Tolerance
E3     Fl: Flexibility
E4     St: Status
E5     Do: Dominance
E6     Sp: Social Participation
E7     Fe: Femininity
E8     De: Delinquency
E9     Ie: Intellectual Efficiency
E10    Ac: Academic Achievement
E11    Hr: Honor Point Ratio
E12    Py: Psychologist's Interest
E13    Ne: Neurodermatitis
E14    X1: Poise, Self-confidence
E15    X2: Impulsivity

F      Guilford-Zimmerman Temperament
F1     G: General Energy
F2     R: Restraint
F3     A: Ascendance
F4     S: Sociability
F5     E: Emotional Stability
F6     O: Objectivity
F7     F: Friendliness
F8     T: Thoughtfulness
F9     P: Personal Relations
F10    M: Masculinity

G      Teacher Prognosis Scales (MMPI)
G1     Tp: Teacher Positive
G2     Tn: Teacher Negative



B group (ACE scores). 
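
The percentage step of the tails method described in the preceding section can be sketched as follows. This is an illustration only, under stated assumptions: the function name `tail_percentages`, the rounding of the 27 percent group size, and the treatment of scores tied with the median are all hypothetical choices, and the final step of the method, reading the coefficient from Flanagan's chart (2), is not reproduced here.

```python
import statistics


def tail_percentages(x, y, tail=0.27):
    """Return (pct_high, pct_low): the percent of the top and of the bottom
    `tail` fraction of cases on x whose y score lies above the median of y.

    Assumptions (not from the source): group size is round(tail * n), and a
    score equal to the median counts as "not above".
    """
    n = len(x)
    k = max(1, round(tail * n))
    order = sorted(range(n), key=lambda i: x[i])  # case indices, low x to high x
    low_group, high_group = order[:k], order[-k:]
    median_y = statistics.median(y)

    def pct_above(indices):
        return 100.0 * sum(y[i] > median_y for i in indices) / len(indices)

    return pct_above(high_group), pct_above(low_group)


# Toy data: y rises exactly with x, so the two tails separate completely.
scores = list(range(100))
print(tail_percentages(scores, scores))
```

The two percentages are the only quantities taken from the data; whatever coefficient is read from them depends on the chart, which is why the sketch stops at this point.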

It was not considered feasible to perform a factor analysis of the
entire matrix, so various minors were selected for the purpose. In
order to present the matrix in a form which can be accommodated on
paper of standard size, it was likewise split up into various
minors. A schematic presentation of the matrix with respect to
this sectioning is shown in Tables II and III. Table II shows the
number of students involved in each section, from which it will be
seen that far fewer cases were covered in the last three groups.
Table III indicates which parts of the master matrix are displayed
in later tables showing specific minors of the determinant.

Tables IV to VII inclusive present various 
minors of this matrix. Tables VIII to XI, inclu- 
sive, present factor analyses resulting from this 
material. The remaining tables in the paper 
concern different samples of a much smaller 
magnitude. 


3. Results and Discussion 


The results, so far as the intercorrelations in Tables IV to VII
are concerned, speak for themselves. It is not considered feasible
to discuss all the implications raised. Such material, however, may
be valuable to investigators other than those in education, and is
presented with this in mind.

An interesting empirical check on the standard error of measurement
for the shortcut method of obtaining correlation coefficients
appears worthy of comment. The method utilized (Flanagan's tails
method) results in an $a_{ij}$ which may be different from
$a_{ji}$. In Minor I (Table IV) the difference ($a_{ij} - a_{ji}$)
was computed for the 870 coefficients in the 30 x 30 matrix, that
is, for 435 pairs. After eliminating and correcting errors signaled
by high differences, the following distribution was obtained:


Difference
Interval        Frequency
 20 to  22          1
 17 to  19          1
 14 to  16          5
 11 to  13          5
  8 to  10         12
  5 to   7         38          N    = 435
  2 to   4         76          M    = 0.06
 -1 to   1        141          S.D. = 4.92
 -4 to  -2         94
 -7 to  -5         41
-10 to  -8         16
-13 to -11          4
-16 to -14          1
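
As a rough check on the summary figures, the mean and standard deviation can be recomputed from the grouped frequencies. The sketch below assumes that every difference sits at the midpoint of its interval, which the source does not state; the function name `grouped_stats` is hypothetical. Because the printed M and S.D. were presumably computed from ungrouped differences, the grouped values should be expected to come out close to them but not identical.

```python
import math

# (lower limit, upper limit, frequency), copied from the distribution above.
INTERVALS = [
    (20, 22, 1), (17, 19, 1), (14, 16, 5), (11, 13, 5), (8, 10, 12),
    (5, 7, 38), (2, 4, 76), (-1, 1, 141), (-4, -2, 94), (-7, -5, 41),
    (-10, -8, 16), (-13, -11, 4), (-16, -14, 1),
]


def grouped_stats(rows):
    """Return (N, mean, SD), treating each case as lying at its interval midpoint."""
    n = sum(freq for _, _, freq in rows)
    mean = sum(freq * (lo + hi) / 2 for lo, hi, freq in rows) / n
    variance = sum(freq * ((lo + hi) / 2 - mean) ** 2 for lo, hi, freq in rows) / n
    return n, mean, math.sqrt(variance)


n, mean, sd = grouped_stats(INTERVALS)
print(n, round(mean, 2), round(sd, 2))
```

The frequencies sum to the 435 pairs expected from a 30 x 30 matrix, which supports the reading of the table given here.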


The mean of this distribution was gratifyingly near zero; the
standard deviation was 4.92. The standard error of an "r" of zero
for an N of 1700 is 2.42 by the usual formula for the standard
error of a Pearsonian "r" (both figures being expressed in
hundredths, as in the tables). In practice, differing values were
averaged. It may be noted that Flanagan (2:347) gives an
approximation for what amounts to the standard error of a
correlation coefficient obtained in the above manner. In the case
where "r" equals 0, it is 1.3 times the standard error for the
Pearsonian coefficient; for an "r" of .45, it is 1.5 times as much.
The maximum standard error values given in the tables are for
Pearsonian "r's". These facts should be taken into account in
interpreting the tables.
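
The arithmetic behind these figures can be sketched as follows. The sketch assumes that the "usual formula" is the large-sample expression (1 - r^2) / sqrt(N - 1), which is one common form and is not spelled out in the source; the function names are hypothetical, and the multipliers 1.3 and 1.5 are simply the two values quoted above from Flanagan (2:347).

```python
import math


def se_pearson_r(r, n):
    """Large-sample standard error of a Pearson r, assumed form (1 - r^2) / sqrt(n - 1)."""
    return (1.0 - r * r) / math.sqrt(n - 1)


def se_tails_r(r, n, multiplier):
    """Standard error of a tails-method coefficient, taken as a multiple of the Pearson value."""
    return multiplier * se_pearson_r(r, n)


# Expressed in hundredths to match the convention of the tables.
print(round(100 * se_pearson_r(0.0, 1700), 2))      # Pearson r = 0, N = 1700
print(round(100 * se_tails_r(0.0, 1700, 1.3), 2))   # tails method, r = 0
print(round(100 * se_tails_r(0.45, 1700, 1.5), 2))  # tails method, r = .45
```

Under this assumed formula the first value agrees with the 2.42 quoted in the text to within rounding.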


4. Factor Analysis 


Factor analysis of various selected minors of the material in
Sample One was done by centroid methods outlined by Thurstone (11).
Results for a 30 x 30 matrix of variables 1 through 30 are shown
in Table VIII. This matrix consisted of variables of the
Cooperative English Test, American Council Psychological
Examination, Allport Study of Values, and Minnesota Multiphasic
Inventory. Six factors were extracted. These were sufficient to
account for more than half the variance except on the following
scales: all scales of the Allport; and Lie, Falsification,
Psychopathic Deviate, Paranoia, Hypomania and Status of the MMPI.
The factors were left unrotated, at least in this initial study,
since some of the factors (such as Factor I, which obviously
represents general intelligence) had considerable psychological
significance as they stood. Some rotation was attempted later, as
shown in Table XI. The unrotated factors of Factor Analysis I were
designated as follows:


Factor I, with its very high loading on the A.C.E. (practically up
to the reliability coefficient) and with somewhat lower loadings on
the Cooperative English Test, was named "Intelligence". As can be
seen, it is considerably more verbal than numerical. The only other
loadings above .20 on this factor are positive with Theoretical and
Dominance, and negative with the Lie scale. This one factor takes
out so much of the variance of the first nine variables that only
in Factor V is there found a single loading of .15 or more. The
A.C.E. Total effectively speaks for the other variables.

Factor II is rather well defined as "K", the somewhat mysterious
suppressor variable on the MMPI. Because either fractions or the
full amount of the "K" value is added to the Hs, Pd, Pt, Sc and Ma
scales, care should be used in concluding that correlations with
these scales fix the description of "K". Of the "uncontaminated
scales" the order of factor loadings is as follows: Hysteria,
Responsibility, Paranoia, Lie, Dominance and Status. Moder-
TABLE III


SCHEMATIC PRESENTATION OF THE MATRIX SHOWING MINORS EXHIBITED


Code and Test A B C D E F G


A Cooperative 
English and 
Stanford 


Minor I 
Table IV 


B American 


Council 
Psychological 

C  Allport- Minor II 
Vernon Study Table V 
of Values 


D Minnesota 
Multiphasic 
Inventory 


Minor III
Table VI 


E California 
Psychological 
Inventory 


Minor IV 
Table VII 


F Guilford- 
Zimmerman 
Temperament 


G Teacher 
Prognosis 
Scales (MMPI)
TABLE IV
(Minor I)


INTERCORRELATIONS BETWEEN COOPERATIVE ENGLISH, AMERICAN COUNCIL PSYCHOLOGICAL,
ALLPORT STUDY OF VALUES AND MINNESOTA MULTIPHASIC INVENTORY*


        Cooperative et al.              A.C.E.          Allport Values

        A1   A2   A3   A4   A5   A6   B1   B2   B3   C1   C2   C3   C4   C5   C6
Code    Voc  Sp   Lev  M    Eff  SA   Q    L    T    T    E    A    S    P    R
AN. BTOB. 53 5% 20 C8; "б S5. а aa б "dé йй 4 
A2 = 78 53 62 42 48 78 "i 16 -21 22 00  -05  -14 
A3 -- 46 56 34 à 34 85 60 18  .-20 18 бп «05 1 
A4 -- 60 37 36 56 56 01 -15 17 10. f3 «08 
A5 zs 98% 39 61 ‘61 10 “1 14 00 -05  -08 
A6 ss 388 AG бї 17 @ -16 о oo  -04 
Bl -- 48 78 09 00 -04 фо от 0 
B2 = :82.- 2 м 22 OW 08 s 
B3 == 20 -16 бї -04 -14 -? 
ci DE NE ы йй 246 
ЕЁ sa б apg 30. “l 
bs n 25 
C4 ое. 
C5 zc #80 
C6 ze 
Da
Db
Dc
D1
D2
D3
D4
D6
D7
D8
D9
D10
D11
D12


*Decimals omitted throughout 
Minnesota Multiphasic Inventory


Code    Da   Db   Dc   D1   D2   D3   D4   D6   D7   D8   D9  D10  D11  D12  D13
         L    F    K   Hs    D   Hy   Pd   Pa   Pt   Sc   Ma   Si   Do   Re   St
A1      16   00  -02  -09   03  -04   01  -04  -05   03  -10   06   23   07   14
A2      23  -04   00  -08  -08  -10  -07  -07  -06   04  -05   03   17   05   13
A3      18  -04   02  -08  -06  -09  -05  -05  -07   01  -10   04   22   07   10
A4      06  -07   06  -06  -04  -02  -05  -07  -02   00  -04   07   15   13   13
A5      10  -05   04  -04  -07   00  -04  -02  -06  -02  -07   03   13   04   06
A6      12  -08   08  -04  -08  -06  -03  -04  -08  -02  -05  -02   10   02   02
B1      11  -07   04  -04  -14  -07  -08  -08  -10  -05  -01  -05   08   01   09
B2      20  -06   02  -10  -02  -08  -01  -06  -04   03  -04   00   24   14   18
B3      02   01  -10  -08  -08  -08  -05  -09  -08  -01  -01   00   20   04   16
C1      09   14  -02  -04   10  -03   08  -04  -06   01  -03   04   13   04   10
C2      06  -03  -05  -04  -05  -12  -11  -11  -08  -11   01   02  -08  -17  -09
C3      00   15  -09  -02   14   04   00   02   08   10   00   08   08   04   14
C4      09  -02   09   05   06   06   08   08   06   10  -02   01   05   10   04
C5      08   00  -03  -06  -07  -06   00  -07  -08  -07   09  -12   11  -10   08
C6      11  -22   06   07  -12   02  -04   09   06  -04  -02  -02  -23   10  -23
Da      --  -08   46   24   03   33   11   17   00   00  -16  -12   03   47   08
Db           --  -26   09   28   06   18   11   23   30   12   24  -05  -18  -09
Dc                --   57  -10   51   39   23   18   47  -16  -44   31   52   22
D1                     --   31   70   40   28   48   50  -02  -03  -03   12  -01
D2                          --   22   32   19   52   26  -22   38  -25  -12  -18
D3                               --   45   38   36   40  -06  -20   10   25   10
D4                                    --   31   42   52   16  -11   04   07   06
D6                                         --   36   36  -02  -05   06   10   00
D7                                              --   64   08   20  -28  -08  -14
D8                                                   --   20  -02   00   07  -02
D9                                                        --  -20  -04  -28   10
D10                                                            --  -32  -18  -43
D11                                                                 --   39   52
D12                                                                      --   19
TABLE VIII
(Factor Analysis I) 


UNROTATED FACTOR LOADINGS FOR MATRIX OF VARIABLES 1 THROUGH 30* 


Independent Variable 


D9 

D10 
D11 
D12 
D13 


*Decimals are omitted throughout. 


Cooperative English Vocabulary 
Cooperative English Speed 
Cooperative English Level 
Cooperative English Mechanics 
Cooperative English Effectiveness 
Stanford Arithmetic 


Quantitative 
Linguistic 
Total 


Theoretical 
Economic 
Aesthetic 
Social 
Political 
Religious 




Lie 

Falsification 
Suppressor Variable 
Hypochondriasis
Depression 

Hysteria 
Psychopathic Deviate 
Pa: Paranoia 
Psychasthenia 
Schizophrenia 
Hypomania 

Social Introversion 
Dominance 

Social Responsibility 
Status 




13 
74 
56 
44 
39 
60 
00 
-35 
33 
47 
32 


56 
-51 
-33 
-49 


For further identification of 


variables, see Table I. 
ate amounts of "K" may be desirable for integrated ego functioning.
The factor is designated as ego-sensitivity.

Factor III is pretty well described as a "hopelessness" indicator.
It has its highest loadings on D, Pt and Si. There is social
withdrawal here also, as indicated by the positive loadings on Sc
and Si, and the negative loading on Do.

Factor IV is an Allport factor designated here as "mystical"
because of its high Religious loading, but also because of the
negative loadings on Theoretical and Dominance. Notice that persons
described by this factor do not withdraw although they may not seek
to lead, and that they do not falsify although they may tend to be
subjective.

Factor V represents the pole of verbal-artistic versus
mathematical-practical. The guess is hazarded that this factor
would show considerable sex difference. The largest variances come
on the Allport, but there are important loadings on Vocabulary and
"Q".

Factor VI seems to be less well defined than the others. It appears
to represent manic tendencies, especially those connected with not
following through on a job. It is designated as "manic
irresponsibility".

Factor Analysis II is detailed in Table IX. From a matrix of 20
variables selected from the California Psychological Inventory and
the Guilford-Zimmerman Temperament Survey, three factors were
extracted and rotated into oblique simple structure. The rotated
factors might be designated as: I, General Teaching Adjustment; II,
Thoughtfulness or anti-delinquency; and III, General Energy. The
rather considerable loadings on Factor I round out the description
of what teaching adjustment, as measured by the Teacher Prognosis
Scale G1, represents.

Factor Analysis III is an attempt to combine
some of the leading variables of I and II. Table
X details the unrotated and rotated loadings of
the three factors extracted from ten selected
variables. Again Factor I seems to be General


ibility. Re
this and the last factor analysis, it appears that
the same factor space is described.


digression with regard to Fact 
The first three variables 


which is shown in Figure 1. 

the A. C. E. variable is at on 
the other two poles represen 
and anti-depression, 


expressed by a single parameter angle, representing their deviation
from an ideal pole "anti-depression". Rotation, hence, seems
unnecessary. The concept of the different scales of the MMPI being
spread out in this fashion introduces some very interesting
speculations and possibilities which only further research can
verify or disprove. These facts should be considered, of course,
in the light of the amount of variance that any particular scale
has with respect to these three factors. The fact that Paranoia and
Psychopathic Deviate are nearly together does not of itself
indicate that they represent the same thing, since only 23 percent
of the variance of the first and 41 percent of the variance of the
second is involved. Nevertheless, the relationships between the
MMPI scales are certainly rather graphically revealed in at least
some of their dimensions by Figures 1 and 2.
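
The parameter angle can be illustrated numerically. The footnote to Table XI defines it by requiring that the second normalized loading equal sin B and the third equal cos (B - 180°), which is the same as -cos B, whenever the first normalized loading is near zero. The sketch below is an illustration under that reading; the function names are hypothetical, the loadings are the unrotated values printed in Table XI for four scales, and the angles printed in the table appear to be rounded, so only approximate agreement should be expected.

```python
import math


def normalize(loadings):
    """Scale a vector of factor loadings to unit length."""
    length = math.sqrt(sum(a * a for a in loadings))
    return [a / length for a in loadings]


def beta_degrees(a2, a3):
    """Angle B in [0, 360) such that a2 = sin B and a3 = cos(B - 180 deg) = -cos B."""
    return math.degrees(math.atan2(a2, -a3)) % 360.0


# Unrotated loadings (I, II, III) from Table XI, with decimals restored.
SCALES = {
    "D2 Depression": (-0.08, 0.13, 0.63),
    "D7 Psychasthenia": (-0.08, 0.39, 0.67),
    "D10 Social Introversion": (0.02, -0.35, 0.56),
    "D11 Dominance": (0.22, 0.33, -0.51),
}

for name, loadings in SCALES.items():
    a1, a2, a3 = normalize(loadings)
    print(name, round(beta_degrees(a2, a3)))
```

Using `atan2` rather than an inverse sine keeps the angle in the correct quadrant, which matters for scales such as Social Introversion whose second loading is negative.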


Samples Two and Three
1. Introduction


Samples Two and Three consist of much smaller populations and far
fewer test scales. They are included chiefly because the resulting
factor analyses confirm the factor space and the location of the
vectors found in the previous analyses. It is considered
significant that different factor analyses done with different
tests and on different populations should turn up similar results.

The populations used in these samples were Education juniors at
Los Angeles State College. The number of cases included in Sample
Two was 110 and in Sample Three 86. There was no overlapping of
personnel between the samples. IBM equipment was not used to
secure the intercorrelations, but the method of Flanagan,
previously mentioned, was employed.


2. Description of the Variables of Sample Two


Eight variables were used in Sample Two: 1) a rating on
authoritarianism, using a modified Adorno "F" scale, 2) a rating on
socio-economic status of parents of the type used by Sims (ratings
were made by the respondents themselves), 3) a sample of the Tp
scale containing about half the items, 4) a sample of the Sc scale
containing about half the items, 5) a sample of the Tn scale
containing about half the items, 6) a sample of the D scale
containing about half the items, 7) the Minnesota Teacher Attitude
Inventory, and 8) intelligence as measured by the Army General
Classification Test. Intercorrelations of these variables are
shown in Table XII.


3. Results of the Factor Analysis


Three factors were extracted. Table XIII


TABLE IX
(Factor Analysis II)


FACTOR LOADINGS FOR MATRIX OF CERTAIN VARIABLES IN E, F, G SERIES*


                                      Unrotated           Rotated**
Independent Variable                  II   III    h²      I    II   III
Eb    Gi: Good Impression            -55    02    72     76   -50   -31
Ec    Ds: Dissimulation               20    34    66    -87   -05    46
E1    Re: Social Responsibility      -18   -21    41     30    70   -83
E3    Fl: Flexibility                 06    56    33     30    70   -83
E4    St: Status                      44    14    55     78    58    00
E5    Do: Dominance                   50   -18    52     68    20    60
E6    Sp: Social Participation        53   -13    68     75    26    45
E8    De: Delinquency                 32    35    25    -36   -93   -30
E9    Ie: Intellectual Efficiency     29    06    60     90    33    00
E10   Ac: Academic Achievement       -12   -20    42     04   -35   -10
E14   X1: Poise, Self-confidence      74    32    80     40    84    00
E15   X2: Impulsivity                 74    11    80    -47    07    30

F1    G: General Energy               34   -27    20     25    05    88
F3    A: Ascendance                   39   -11    38     76    20    42
F5    E: Stability                   -31   -09    50     89   -46   -10
F7    F: Friendliness                -45    45    70     65   -03   -75
F8    T: Thoughtfulness               26   -37    21     10    77    40
F9    P: Personal Relations          -27    26    50     85   -06   -47

G1    Tp: Teacher Positive           -01   -09    52     99   -01   -09
G2    Tn: Teacher Negative            10    08    40    -98    05    00


* Decimals omitted throughout. For further identification of the variables, see Table I.
**Correlations between the rotated factors: I and II, .00; I and III, .00; II and III, .50.
TABLE X
(Factor Analysis III)


FACTOR LOADINGS FOR MATRIX OF TEN SELECTED VARIABLES*


                                                   Unrotated               Rotated**
Independent Variable                              I    II   III    h²     I    II   III
B3    American Council Psychological Total       15    17   -04    05    65    10    60
Dc    K: Suppressor Variable on MMPI             74   -33   -06    66    75   -55    10
D7    Pt: Psychasthenia on MMPI                 -07   -36    42    31   -67   -98    10
Eb    Gi: Good Impression on CPI                 60   -47   -39    73    75   -30   -25
E3    Fl: Flexibility on CPI                     51   -04    57    58    02   -92    72
E4    St: Status on CPI                          61    42    14    56    55   -30    82
E14   X1: Poise, Self-confidence on CPI          50    68    12    72    42   -05    80
F1    G: General Energy on G-Z                  -01    41   -35    29    35    85    00
F7    F: Friendliness                            66   -38   -08    58    73   -60    05
G1    Tp: Teacher Positive on MMPI               67    06   -43    63    99   -01    01


* Decimals are omitted throughout. For further identification of variables, see Table I.
**Correlations between the rotated factors: I and II, .00; I and III, .00; II and III, .50.
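
The h² column of Table X can be checked against the unrotated loadings, since with orthogonal (unrotated) centroid factors the communality of a variable is the sum of its squared loadings. The sketch below is an illustration only; the dictionary layout and the function name `communality` are hypothetical, and the figures are those printed in Table X with decimals restored.

```python
# Unrotated loadings (I, II, III) and printed h2 from Table X, in hundredths.
TABLE_X = {
    "B3": (15, 17, -4, 5),
    "Dc": (74, -33, -6, 66),
    "D7": (-7, -36, 42, 31),
    "Eb": (60, -47, -39, 73),
    "E3": (51, -4, 57, 58),
    "E4": (61, 42, 14, 56),
    "E14": (50, 68, 12, 72),
    "F1": (-1, 41, -35, 29),
    "F7": (66, -38, -8, 58),
    "G1": (67, 6, -43, 63),
}


def communality(loadings_in_hundredths):
    """Sum of squared loadings, after restoring the omitted decimal point."""
    return sum((value / 100.0) ** 2 for value in loadings_in_hundredths)


for code, (f1, f2, f3, printed_h2) in TABLE_X.items():
    print(code, round(communality((f1, f2, f3)), 2), printed_h2 / 100.0)
```

Each recomputed value should fall within about .01 of the printed figure, the small differences being attributable to rounding of the loadings.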


TABLE XI
(Detail of Factor Analysis I)


NORMALIZATION OF FIRST THREE FACTORS FOR CERTAIN VARIABLES
OF FACTOR ANALYSIS I*


                                                   Unrotated               Normalized
Independent Variable                              I    II   III    h²     I    II   III   Beta**
B3    American Council Psychological Total       93   -02   -01    86    99   -02   -01
Dc    K: Suppressor Variable on MMPI             04    80    18    67    05    98    22    80°
D1    Hs: Hypochondriasis                       -09    66    34    55   -12    89    45   119°
D2    D: Depression                             -08    13    63    42   -12    20    97   168°
D3    Hy: Hysteria                              -08    74    17    57   -10    98    22   100°
D4    Pd: Psychopathic Deviate                  -05    56    32    41   -08    88    50   120°
D6    Pa: Paranoia                              -08    44    19    23   -16    91    40   113°
D7    Pt: Psychasthenia                         -08    39    67    60   -10    50    86   150°
D8    Sc: Schizophrenia                          00    60    49    60    00    77    63   130°
D10   Si: Social Introversion                    02   -35    56    43    03   -54    86   210°
D11   Do: Dominance                              22    33   -51    41    34    52   -80    33°
D12   Re: Social Responsibility                  08    47   -33    33    14    82   -58    58°
D13   St: Status                                 15    32   -49    36    25    53   -82    32°


* Decimals are omitted throughout. For further identification of the variable, see Table I.

**If the normalized loadings for the three factors are $a_{1i}$, $a_{2i}$, $a_{3i}$, and $a_{1i} = 0$
approximately (or is less than .1), then $a_{2i} = \sin\beta$ and $a_{3i} = \cos(\beta - 180°)$. In other words,
$\beta$ is a parameter angle along which the MMPI scales seem to be distributed. (See Figures 1 and 2.)


Figure 1.
Diagram of Positive Hemisphere, MMPI

Figure 2. Section through Great Circle
Showing Parameter Beta in
Relation to MMPI Scales
TABLE XII


INTERCORRELATIONS OF VARIABLES OF SAMPLE TWO (LOS ANGELES STATE COLLEGE)*


Code and Scale                                          I1    G1    D8    G2    D2    J1    K1
H1   Authoritarianism from Modified Adorno Scale       -05   -27    37    15    10   -61   -41
I1   Socio-economic Status of Parents (Sims type)             20   -20   -28   -15    21   -14
G1   Tp: Teacher Positive (MMPI)**                                 -65   -58   -22    31    33
D8   Sc: Schizophrenia (MMPI)**                                           60    31   -31   -24
G2   Tn: Teacher Negative (MMPI)**                                              34   -47   -42
D2   D: Depression (MMPI)**                                                          -28   -30
J1   Minnesota Teacher Attitude Inventory (MTAI)                                            54
K1   Intelligence: Army General Classification Test


* Decimals are omitted throughout. For identification of the scales, consult Table I and introductory
context to Table XII.
**Represents only a sample of the scale named.


TABLE XIII
(Factor Analysis IV)


FACTOR LOADINGS FOR MATRIX OF TABLE XII (LOS ANGELES STATE COLLEGE)*


                                                   Unrotated           Rotated**
Independent Variable                              I    II   III      I    II   III
H1   Authoritarianism (Modified Adorno)          28    79   -17    -30    90    10
I1   Socio-economic Status of Parents           -35    16    53     54   -10    90
G1   Tp: Teacher Positive (MMPI)                -10    13   -08     98    25    10
D8   Sc: Schizophrenia (MMPI)                    10   -63    30    -74   -74    00
G2   Tn: Teacher Negative (MMPI)                 19   -09    00    -97    20   -15
D2   D: Depression (MMPI)                        53   -03   -03    -99   -05   -12
J1   Minnesota Teacher Attitude Inventory       -65   -50   -06     80   -37   -24
K1   Intelligence: (AGCT)                        58   -40   -49     65   -10   -65


* Decimals are omitted throughout. For further identification of variables refer to Table I or
Table XII.
**Correlations between the rotated factors: I and II, -.09; I and III, -.09; II and III, .03.
TABLE XIV


INTERCORRELATIONS OF VARIABLES OF SAMPLE THREE (LOS ANGELES STATE COLLEGE)*


Code and Scale                          L2    L3    L4    L5    M1    J1    G1
L1   Bell Adjustment: Home              56    44    59    31   -19    13    36
L2   Bell Adjustment: Health                  30    68   -02   -03   -02    38
L3   Bell Adjustment: Social                        62    20   -07   -03    49
L4   Bell Adjustment: Emotional                           33   -24    24    42
L5   Bell Adjustment: Vocational                                -21    22    30
M1   RAPH: Rigidity                                                   -73   -37
J1   MTAI: Teacher Attitude                                                  60
G1   Tp: Teacher Positive (MMPI)
G2   Tn: Teacher Negative (MMPI)


*Decimals are omitted throughout. For identification of the scales, consult Table I and introductory
context to Table XIV. Maximum standard error of R is .11.


TABLE XV
(Factor Analysis V)


FACTOR LOADINGS FOR MATRIX OF TABLE XIV (LOS ANGELES STATE COLLEGE)*


                                             Unrotated               Rotated**
Independent Variable                        I    II   III    h²     I    II   III
L1   Bell Adjustment: Home                 65   -36    11    56    ..    ..    ..
L2   Bell Adjustment: Health               53   -53    42    74    00    26    80
L3   Bell Adjustment: Social               60   -41   -35    65   -55   -09    91
L4   Bell Adjustment: Emotional            71   -40   -22    80   -10   -34    95
L5   Bell Adjustment: Vocational           47    13   -31    34    ..    ..    ..
M1   RAPH: Rigidity                       -53   -54    08    56   -63    94    07
J1   MTAI: Teacher Attitude                56    65    26    80    78   -56    01
G1   Tp: Teacher Positive (MMPI)           77    19   -01    63    76    00    00
G2   Tn: Teacher Negative (MMPI)          -75    03   -26    62   -39    59   -40
H1   Authoritarianism (Adorno)            -49   -34    33    46    00    87   -15


* Decimals are omitted throughout. For identification of variables see Table XIV.
**Correlations between the rotated factors: I and II, .47; I and III, .53; II and III, -.24.
Figure 3.
Diagram of the "Positive" Hemisphere Pooling Results of Analyses
(Subscripts indicate factor analysis which located point.
A marked point means the antipodes of the vector point of contact with the sphere.
Lines connect substantially similar measures.)
shows the unrotated and rotated loadings for oblique simple
structure. The first factor seems to be General Teaching
Adjustment, the second Authoritarianism and the third related to
Status.


4. Description of the Variables of Sample Three 


The ten variables of Sample Three were four of those used in
Sample Two and six new ones. The first five consisted of the home,
health, social, emotional and vocational scales of the Bell
Adjustment Inventory (Adult Form). These scales were reversed so
that a "low" score was considered high in desirability in
interpretation; in other words, the scales were turned so that the
desirable social result was "up". In consequence, these scales
would be expected to correlate positively with a scale of
adjustment and negatively with a scale of maladjustment. The sixth
measure was the so-called RAPH scale, a measure of rigidity of
attitudes regarding personal habits developed by Meresko (8). The
last four scales were the Minnesota Teacher Attitude Inventory,
the Tp and Tn scales for the MMPI, and the Adorno-type
authoritarianism scale used in Sample Two. The intercorrelations
of these variables are shown in Table XIV.


5. Factor Analysis Results 


Three factors were extracted. Table XV shows the unrotated and
rotated loadings for oblique simple structure. The first factor
again seems to be General Teaching Adjustment, and the second
again Authoritarianism. The third seems to be most closely related
to the Bell Emotional Scale.


6. Collated Results 


When figures were drawn to represent the several factor analyses
and were compared, it appeared that most of them displayed common
factor space. It is recognized that much of the variance of the
variables cannot be expressed in three dimensions, yet there
seemed to be enough of a "common view" to make it worthwhile to
superimpose the diagrams. This has been done in Figure 3, which
shows the vector intersections of selected scales with the
positive hemisphere, pooling the results of a number of different
factor analyses. The Tp point has been used to locate the center
pole, and the authoritarian pole is at the extreme left, so that
the vertical axis is its great circle. Lines have been drawn
between variables which purport to measure the same thing.
Subscripts indicate which factor analysis was involved. It will be
noted that there is considerable uniformity in the position of the
points, even as between different factor analyses. The angle
between authoritarianism and intelligence, for example, seems to
be about 135 degrees. The 30 degree and 60 degree small circles
have been drawn, and various areas of the circumference great
circle have been named.
From Figure 3, Table XVI has been constructed. This table gives in
very rough form estimations of the correlations between selected
clusters noted in Figure 3. It also gives the angular deviation
measured from the pole between these clusters. Such an arrangement
provides a kind of circular coordinate system. It is to be
emphasized that the measurements are rough and therefore inexact.
The arrangement of these vectors in the factor space is perhaps
made more understandable by such a procedure. At least, their
relationships to each other in the common factor space become more
apparent. It is the contention of the writers that the existence
of this common factor space, as revealed by several factor
analyses of different tests on different populations, helps to
further understanding with regard to the interrelationships
between these variables.
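
The link between an angle read from such a diagram and an estimated correlation can be sketched as follows. The sketch rests on the standard geometric reading of a factor space, in which the correlation reproduced by the common factors equals the product of the two vector lengths (the square roots of the communalities) and the cosine of the angle between the vectors; whether the estimates of Table XVI were obtained in exactly this way is an assumption, and the function name `implied_correlation` is hypothetical.

```python
import math


def implied_correlation(angle_degrees, h_i=1.0, h_j=1.0):
    """Correlation implied by two test vectors separated by `angle_degrees`.

    h_i and h_j are the vector lengths (square roots of the communalities);
    the default of 1.0 treats both points as lying on the unit sphere.
    """
    return h_i * h_j * math.cos(math.radians(angle_degrees))


# The angle of about 135 degrees quoted for authoritarianism and intelligence.
print(round(implied_correlation(135.0), 2))

# Shorter vectors shrink the implied correlation proportionally
# (the communalities of .46 and .50 here are invented for illustration).
print(round(implied_correlation(135.0, math.sqrt(0.46), math.sqrt(0.50)), 2))
```

Because points on the hemisphere are shown at unit length, the cosine alone overstates the correlation actually reproduced for any scale with a communality well below 1, which is one reason the estimates can only be rough.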
The writers believe further interpretation should await
corroboration by others and further exploration. They are aware of
the rough and often informal methods utilized with some of the
data and of the incompleteness of many of its parts. It was
Thurstone himself who said, in this regard:


    The exploratory nature of factor analysis is not often
    understood. Factor analysis has its principal usefulness at
    the borderline of science.... These new methods have a humble
    role. They enable us only to make the crudest first map of a
    new domain. But if we have scientific intuition and sufficient
    ingenuity, the rough factorial map of the new domain will
    enable us to proceed beyond..... (11:56)


It is in this light that these explorations are of- 
fered. 


Summary 


This paper reports the intercorrelations and resulting factor
analyses from giving extensive testing batteries to teaching
candidates. The major work sample was at UCLA, involving numbers
ranging up to 1700 subjects and scales on Cooperative English,
Stanford Arithmetic, American Council Psychological, Allport Study
of Values, Minnesota Multiphasic, California Psychological,
Guilford-Zimmerman Temperament, and two scales on Teaching
Prognosis devised by the writers. The minor work samples included
two groups of about 100 subjects at Los Angeles State College,
involving intelligence, status, authoritarianism, Minnesota
Teacher Attitude Inventory, Bell Adjustment Inventory, and the
previously mentioned teaching scales.

Results of the factor analyses seemed to show a common factor
space, and helped to clarify the relation of other generally used
variables to this measure of teaching potential. Further
investigations appear in order.
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MANY EDUCATORS and psychologists have 
come to believe that the efficient way to esti- 
mate the internal consistency of a measuring in- 
strument is to divide it into halves, score each 
half separately, correlate the pairs of scores so 
obtained by means of the Pearson product-mo- 
ment coefficient, and correct this value for the 
fact that it is based on halves instead of wholes. 
This procedure was developed independently by 
Spearman (13) and by Brown (2) for the solution 
of certain test construction problems. It is also
widely believed that the efficient way to estimate
the stability of an instrument is to administer
equivalent forms to a sample of subjects and
correlate the two sets of scores so obtained. It 
is the purpose of this paper to suggest the effi- 
ciency of the analysis of variance technique for 
estimating the reliability of educational meas- 
ures, and to illustrate use of the technique on ob- 
servations of teachers’ classroom behavior. 

The analysis of variance has been suggested
as a method for estimating test reliability by
Jackson (6, 7), Hoyt (5) and Alexander (1). Its
use for estimating the reliability of scores assigned
to compositions or essay examinations has been
discussed by Pilliner (11). Lindquist (8:357-82)
has recently presented a rather complete
discussion of the general use of the analysis of
variance technique for reliability estimation in
educational and psychological measurement. The
method is particularly well adapted to
observational data, as Lindquist remarks, but
concrete examples of its proper use are not
available in the literature.

In connection with a longitudinal study of
teacher education graduates of the New York
City municipal colleges (14), the reliability of
two observational techniques for assessing
teachers' classroom behaviors was studied.
The method of estimating reliability used in
this longitudinal study will be described briefly,
and its application to the classroom observation
data will be illustrated. Methods of computation
will not be given since they are readily accessible
in other sources; emphasis will be on the logic of
the procedure and the interpretation of results.


Method of Analysis and Definition of Terms 


Suppose that N teachers are visited m times 
each by a team of n observers. Each teacher will
be assigned a score on the dimension of interest 
by each observer on each visit, yielding a total 
of mn scores per teacher and a grand total of 
Nmn observations for analysis. 

Among the factors which may be expected to
produce variation among the scores are two:
differences among teachers and differences among
visits. For convenience, $T_i$ will be used to
represent the deviation (from the mean of all
observations) associated with Teacher i, and $V_j$
will be used to represent the deviation associated
with Visit j. It is understood that $T_i$ will be the
same for Teacher i on every visit and $V_j$ will be
the same for all teachers on the jth visit to each
of them.

If $P_{ij}$ is the performance of Teacher i on visit
j, it is probable that


$$P_{ij} \neq T_i + V_j.$$


In other words, there is likely to be an
"interaction" between visits and teachers: some
teachers may do better on the first visit than on
any other; other teachers may do better on the
last visit, etc. Therefore, let


$$I_{ij} = P_{ij} - (T_i + V_j); \tag{1}$$


$I_{ij}$ is the interaction component of a teacher's
score on a particular occasion.

When a particular observer k visits a particular
teacher i on a particular occasion j, the score
$X_{ijk}$ (taken as a deviation from the mean of all
values of $X_{ijk}$) that that observer assigns to the
teacher may not be identically equal to $P_{ij}$, the
actual performance of that teacher on that visit.
Define


$$e_{ijk} = X_{ijk} - P_{ij} = X_{ijk} - T_i - V_j - I_{ij}. \tag{2}$$


The "error" $e_{ijk}$ will include all parts of the
score not otherwise accounted for in equation
(2).

$I_{ij}$ will be referred to as the visit error for
teacher i on visit j; $e_{ijk}$ as the residual or
observer error for teacher i on visit j in
observation k. Error is thus partitioned into two
parts: one containing errors due to lack of
stability in teacher performance, and the other
containing all errors independent of such lack of
stability. (The latter component is referred to as
"observer" error because it will show up in the
discrepancy between two records of the same
performance made by two different observers.)

If (2) is rewritten as follows:


$$X_{ijk} = T_i + V_j + I_{ij} + e_{ijk},$$


and if both sides are squared and the operation
of taking mathematical expectations in the
population (generated as N, m, and n all approach
infinity) is performed, the result may be written:


$$\sigma_x^2 = \sigma_t^2 + \sigma_v^2 + \sigma_{tv}^2 + \sigma^2, \tag{3}$$


where $\sigma_x^2$ is the total variance for all
observations X, $\sigma_t^2$ is the variance of the $T_i$,
$\sigma_v^2$ of the $V_j$, $\sigma_{tv}^2$ of the $I_{ij}$, and
$\sigma^2$ of the $e_{ijk}$, in their respective populations.

What is meant by the "reliability" of a scale
depends on what true score is of interest, since
the error in a score is the difference between it
and the true score it estimates. As will be
shown, the reliability of a scale as a measure
of $P_{ij}$ will generally be greater than its
reliability as a measure of $T_i$. In the present
instance, the true score of interest is $T_i$, the
mean of all performances $P_{ij}$ of teacher i on all
occasions j on which a visit might be made to the
teacher. Ideally, the population of visits j should
include all possible situations that arise during a
teacher's career. More realistically, it should
include all situations during a particular school
year or term; this could be approximated by use
of proper sampling procedures in selecting the
times at which observations are made. Then $T_i$
would represent the "typical" performance of
Teacher i.

Similarly, the nature of the population of
teachers is not clearly perceived unless the
teachers observed are drawn at random from a
specified population of teachers. It will be
assumed, however, that both populations do exist,
whether or not they can be specified.

The reliability of a score based on a single
observation may then be defined as


$$R = \sigma_t^2 / (\sigma_t^2 + \sigma_{tv}^2 + \sigma^2). \tag{4}$$

The numerator on the right is the variance of
$T_i$; that is, the "true score" variance. The
denominator is the sum of the true score variance
and the two error variances. Comparison with
equation (3) reveals that this sum represents the
total variance of the scores with the component
due to visit differences, $\sigma_v^2$, removed. This
variance is removed because we will compare
teachers who have all been visited equally often,
and the scores will be means over all visits so that
the visit effects are cancelled out. The
denominator in equation (4) is the total variance
of these scores about their mean. The reliability
coefficient is thus seen to be the ratio of true
score variance to total variance, or, in other
words, the proportion of the total variance
attributable to differences among teachers.

This reliability coefficient R is the parameter
that is usually estimated by correlating scores
assigned to a set of teachers by two observers
visiting the teachers at different times. This
method of estimating reliability is quite
unsatisfactory, however, since only two scores per
teacher can be used, with the result that the
estimate has very low precision when the number
of teachers is small.

A second type of "reliability" coefficient that
is sometimes used regards $P_{ij}$, the true
performance of teacher i on visit j, as the true
score to be estimated. This coefficient will be
referred to here as the coefficient of observer
agreement, R', and may be defined as follows:


$$R' = (\sigma_t^2 + \sigma_{tv}^2) / (\sigma_t^2 + \sigma_{tv}^2 + \sigma^2). \tag{5}$$


In this case, fluctuations in teacher performance
are regarded as part of true score variance,
since they are capable of being observed by
observers present on a particular occasion. This
coefficient may also be estimated by correlating
scores assigned to a group of teachers by two
observers visiting each of them at the same time;
it is a measure of observers' ability to agree [...]
to as a reliability coefficient.
The reliability of the mean of a number of
scores assigned to the same teacher is easily
derived. In terms of the observer team size, n,
and the number of visits, m, it is $R_{mn}$, where


$$R_{mn} = mn\sigma_t^2 / (mn\sigma_t^2 + n\sigma_{tv}^2 + \sigma^2). \tag{6}$$


If it is assumed that $T_i$, $V_j$, $I_{ij}$, and $e_{ijk}$ are
normally and independently distributed in repeated
random sampling with zero means and variances
$\sigma_t^2$, $\sigma_v^2$, $\sigma_{tv}^2$ and $\sigma^2$, respectively, then
the values of these variance components and
hence of the coefficients R and R' may be
estimated from an analysis of variance table of the
form shown in Table I.

Table I is based on samples of N teachers,
m visits, and teams of n observers each. The
observed mean squares and their expectations
in terms of variance components are shown at
the right. Estimates of the variance components
of interest may be obtained from the observed
mean squares as follows (the symbol "(=)" may
be read, "is estimated by"):


$$\sigma^2 \;(=)\; s^2$$
$$\sigma_{tv}^2 \;(=)\; (s_{tv}^2 - s^2) / n \tag{7}$$
$$\sigma_t^2 \;(=)\; (s_t^2 - s_{tv}^2) / mn$$


By substituting these estimates and the ap- 
propriate values of m and n in equations (4) to 
(6), estimates of the coefficients of reliability 
and of observer agreement secured in a given 
experiment may be obtained. 
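
The estimation route just described can be sketched in code. The following Python function is only an illustration under stated assumptions, not the computing procedure used in the study: it assumes the scores are held in a nested list `scores[i][j][k]` (teacher i, visit j, observer k), and the names `reliability_anova`, `scores` and `toy` are hypothetical. It computes the four mean squares of Table I and then solves the expected mean squares for the components, as in equation (7).

```python
def reliability_anova(scores):
    """Mean squares of Table I and the component estimates of equation (7).

    scores[i][j][k] is the score given to teacher i on visit j by observer k
    (a layout assumed for this sketch).
    """
    N, m, n = len(scores), len(scores[0]), len(scores[0][0])
    grand = sum(x for t in scores for v in t for x in v) / (N * m * n)
    # Means for each teacher, each visit, and each teacher-visit cell.
    t_mean = [sum(sum(v) for v in t) / (m * n) for t in scores]
    v_mean = [sum(sum(scores[i][j]) for i in range(N)) / (N * n) for j in range(m)]
    cell = [[sum(scores[i][j]) / n for j in range(m)] for i in range(N)]

    ss_t = m * n * sum((a - grand) ** 2 for a in t_mean)
    ss_v = N * n * sum((b - grand) ** 2 for b in v_mean)
    ss_tv = n * sum((cell[i][j] - t_mean[i] - v_mean[j] + grand) ** 2
                    for i in range(N) for j in range(m))
    ss_e = sum((x - cell[i][j]) ** 2
               for i in range(N) for j in range(m) for x in scores[i][j])

    ms = {
        "teachers": ss_t / (N - 1),
        "visits": ss_v / (m - 1),
        "visit_error": ss_tv / ((N - 1) * (m - 1)),
        "observer_error": ss_e / (N * m * (n - 1)),
    }
    # Equation (7): solve the expected mean squares for the components.
    comp = {
        "observer": ms["observer_error"],
        "visit": (ms["visit_error"] - ms["observer_error"]) / n,
        "teacher": (ms["teachers"] - ms["visit_error"]) / (m * n),
    }
    return ms, comp


# Toy data: 2 teachers, 2 visits, 2 observers (invented numbers).
toy = [[[1, 3], [5, 7]], [[2, 4], [10, 12]]]
ms, comp = reliability_anova(toy)
print(ms)
print(comp)
```

With real data a difference of mean squares can come out negative; the sketch returns it unchanged and leaves the decision about how to report such a component to the reader.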

It is also possible to test the hypotheses,


$$H_0: \sigma_t^2 = 0$$
and
$$H_1: \sigma_{tv}^2 = 0.$$


Hypothesis $H_0$ states that the scale fails to
discriminate among teachers; hypothesis $H_1$
states that there is, on the average, no greater
variation between two records based on different
visits than between two records based on the
same visit.

$H_1$ is tested by comparing


$$F_1 = s_{tv}^2 / s^2$$


with Snedecor's F distribution (10:222-225) with
degrees of freedom $n_1 = (N - 1)(m - 1)$ and
$n_2 = Nm(n - 1)$. If $H_1$ is accepted, it is concluded
that $\sigma_{tv}^2 = 0$, and Table I is superseded by an
analysis of the form shown in Table II. For $\sigma_{tv}^2$
in equations (3) to (6) a zero is substituted, and
the estimation equations for the variance
components become:


$$\sigma^2 \;(=)\; s_e^2$$
$$\sigma_t^2 \;(=)\; (s_t^2 - s_e^2) / mn \tag{8}$$


Since $\sigma_{tv}^2 = 0$, $H_0$ is tested by comparing

$$F_0 = s_t^2 / s_e^2$$

with the tables of the F distribution with degrees
of freedom $n_1 = (N - 1)$ and $n_2 = N(mn - 1) - (m - 1)$.


If $H_1$ is rejected, it is concluded that $\sigma_{tv}^2 > 0$,
and $H_0$ is tested by


$$F_0 = s_t^2 / s_{tv}^2$$


with the F tables with degrees of freedom $n_1 =
(N - 1)$ and $n_2 = (N - 1)(m - 1)$.

If $H_0$ is accepted, it is best to conclude that
the reliability of the scale is zero; if $H_0$ is
rejected, it is proper to estimate R as indicated in
equations (4) and (7).


Application of the Design to Tryouts of the 
Cornell Technique 


The first of the two techniques employed in this 
study was developed by Francis G. Cornell and 
his associates at the University of Illinois. For 
the purposes of this investigation, Cornell’s tech- 
nique was modified slightly; readers interested 
in the original form should consult the monograph 
in which it was originally presented (3); the modi- 
fied form is described in a monograph by Medley 
and Mitzel (9). 

Six observers participated in the tryouts. They
visited 33 teachers in teams of two observers
each. Each of the six observers saw each of the
33 teachers once, so that the total number of
scores on each of the eight dimensions was 198.
The six observers were grouped into one set of 
three teams and the first eleven teachers were 
visited by each team, no two teams visiting the 
same teacher on the same day. The six observ- 
ers were then rearranged into a different set 
of three teams, and eleven more teachers were 
visited. Finally, the team composition was 
changed a third time and the remaining eleven 
teachers were visited. 

The simplest way of analyzing these data is to
regard each series of visits to eleven teachers as
a distinct tryout. In this case, N = 11, m = 3, and
n = 2. The design in Table I could be used to
analyze the results of each tryout separately.

If it is assumed that all 33 teachers may be
regarded as having been drawn at random from the
same population of teachers, and that all of the
nine teams used may be regarded as having been
drawn at random from the same population of
teams, then it is reasonable to expect that the
corresponding observed mean squares in different
tryouts estimate parameters of the same
population, and the respective sums of squares may
be pooled to yield more precise estimates of the
parameters. Since there are 198 scores in all,
yielding a total of 197 degrees of freedom, and
since each tryout employs 66 scores, yielding 65
degrees of freedom per tryout, or a total of 195,
there are two degrees of freedom remaining.
These two degrees of freedom may (under the
assumption stated) be used to estimate the
"teacher" mean square from differences between
groups of teachers, making a total of 32 degrees
of freedom available for this purpose. The
complete design is shown in Table III.


TABLE I

PLAN FOR RELIABILITY ANALYSIS OF VARIANCE OF
SCORES ON A BEHAVIORAL DIMENSION

Source of                                   Mean Squares
Variation         d.f.                Observed     Expected
Teachers          N - 1               $s_t^2$      $\sigma^2 + n\sigma_{tv}^2 + mn\sigma_t^2$
Visits            m - 1               $s_v^2$      $\sigma^2 + n\sigma_{tv}^2 + Nn\sigma_v^2$
Visit Error       (N - 1)(m - 1)      $s_{tv}^2$   $\sigma^2 + n\sigma_{tv}^2$
Observer Error    Nm(n - 1)           $s^2$        $\sigma^2$
Total             Nmn - 1


TABLE II

PLAN FOR RELIABILITY ANALYSIS OF SCORES ON A
BEHAVIORAL DIMENSION WHEN THERE IS NO INTERACTION
BETWEEN TEACHERS AND VISITS

Source of                                   Mean Squares
Variation         d.f.                     Observed    Expected
Teachers          N - 1                    $s_t^2$     $\sigma_e^2 + mn\sigma_t^2$
Visits            m - 1                    $s_v^2$     $\sigma_e^2 + Nn\sigma_v^2$
Error             N(mn - 1) - (m - 1)      $s_e^2$     $\sigma_e^2$
Total             Nmn - 1


TABLE III

COMPLETE DESIGN FOR ANALYZING THE SCORES OBTAINED
ON A CORNELL SCALE IN THE SERIES OF THREE TRYOUTS

Source of                      Mean Squares
Variation         d.f.    Observed     Expected
Teachers           32     $s_t^2$      $\sigma^2 + 2\sigma_{tv}^2 + 6\sigma_t^2$
Visits              6     $s_v^2$      $\sigma^2 + 2\sigma_{tv}^2 + 22\sigma_v^2$
Visit Error        60     $s_{tv}^2$   $\sigma^2 + 2\sigma_{tv}^2$
Observer Error     99     $s^2$        $\sigma^2$
Total             197


TABLE IV

RELIABILITY ANALYSIS OF DIFFERENTIATION SCORES

Source of
Variation         d.f.    Sum of Squares    Mean Square
Teachers           32       1828.3132         57.1348
Visits              6        149.9696         24.9949
Visit Error        60       1044.6970         17.4116
Observer Error     99        664.5000          6.7121
Total             197       3687.4798


As an example, the complete analysis of variance
for the Differentiation scale, based on the
three tryouts, is shown in Table IV.
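
The degrees of freedom of the pooled design can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name `pooled_df` and its arguments are hypothetical. It pools three tryouts with N = 11, m = 3, n = 2 and assigns the leftover degrees of freedom to the "teachers" line, as the text describes.

```python
def pooled_df(N, m, n, tryouts):
    """Degrees of freedom when `tryouts` separate N x m x n analyses are pooled."""
    df = {
        "teachers": tryouts * (N - 1),
        "visits": tryouts * (m - 1),
        "visit_error": tryouts * (N - 1) * (m - 1),
        "observer_error": tryouts * N * m * (n - 1),
    }
    total = tryouts * N * m * n - 1
    # Whatever is left over comes from differences between the groups of
    # teachers; under the stated assumption it is added to the "teachers" line.
    leftover = total - sum(df.values())
    df["teachers"] += leftover
    return df, total, leftover


df, total, leftover = pooled_df(N=11, m=3, n=2, tryouts=3)
print(df, total, leftover)
```

The resulting entries agree with the degrees of freedom printed for the pooled analysis (32, 6, 60, 99, and 197 in all).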

The test of $H_1$ ($\sigma_{tv}^2 = 0$) yielded an F ratio
of 2.59, which is beyond the .01 point of the
tabled distribution; it was, therefore, concluded
that $\sigma_{tv}^2$ is greater than zero, and that
variations in teacher performance from day to day
are a factor contributing to the error in
Differentiation scores.

The test of $H_0$ ($\sigma_t^2 = 0$) yielded an F ratio of
3.28, which is also beyond the .01 point. It was,
therefore, concluded that $\sigma_t^2$ is greater than
zero, and that the Differentiation scale
discriminates teachers with a reliability greater
than zero.
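
The two F ratios can be reproduced from the sums of squares and degrees of freedom printed in Table IV. The snippet below is a minimal sketch; the dictionary `table_iv` and the variable names are introduced here for illustration only.

```python
# Sums of squares and degrees of freedom as printed in Table IV.
table_iv = {
    "teachers": (1828.3132, 32),
    "visits": (149.9696, 6),
    "visit_error": (1044.6970, 60),
    "observer_error": (664.5000, 99),
}
mean_sq = {name: ss / df for name, (ss, df) in table_iv.items()}

# H1 (no visit error): visit-error mean square against observer-error mean square.
f1 = mean_sq["visit_error"] / mean_sq["observer_error"]
# H0 (no teacher differences), with H1 rejected: teacher mean square against
# visit-error mean square.
f0 = mean_sq["teachers"] / mean_sq["visit_error"]

print(round(f1, 2), round(f0, 2))
```

The ratios are then referred to F tables with (60, 99) and (32, 60) degrees of freedom respectively. If SciPy happens to be available, `scipy.stats.f.sf` could supply the tail probabilities in place of printed tables; that step is left out here to keep the sketch free of dependencies.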

When estimated according to equation (7), the
components of the variance of a score were taken
to be as follows:


$$\sigma^2 \;(=)\; 6.71$$
$$\sigma_{tv}^2 \;(=)\; 5.35$$
$$\sigma_t^2 \;(=)\; 6.62$$


The estimated reliability of a Differentiation
score based on a single record for one 25-minute
visit is:


$$r = (6.62) / (6.62 + 5.35 + 6.71) = .35$$


and the estimated coefficient of observer
agreement is:


$$r' = (11.97) / (18.68) = .64$$


The r of .35 indicates that 35 percent of the var- 
iance of the scale is due to differences among 
teachers; 65 percent must, then, be due to er- 
rors of measurement. From the estimated com- 
ponents we calculate that 29 percent of the vari- 
ance is due to visit-to-visit variations in teach- 
er behavior, and 36 percent to discrepancies be- 
tween different observers’ records of the same 
behavior. 
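
The chain from mean squares to coefficients can be followed step by step in code. The sketch below is an illustration, not the authors' worksheet: the function name `coefficients` and the dictionary keys are hypothetical, and the inputs are the mean squares printed in Table IV with m = 3 visits and n = 2 observers per team in each tryout.

```python
def coefficients(ms_teachers, ms_visit_error, ms_observer_error, m, n):
    """Component estimates (equation 7) and the coefficients of equations (4) and (5)."""
    var_obs = ms_observer_error
    var_visit = (ms_visit_error - ms_observer_error) / n
    var_teacher = (ms_teachers - ms_visit_error) / (m * n)
    total = var_teacher + var_visit + var_obs
    return {
        "var_teacher": var_teacher,
        "var_visit": var_visit,
        "var_observer": var_obs,
        "reliability": var_teacher / total,
        "observer_agreement": (var_teacher + var_visit) / total,
        "share_visit_error": var_visit / total,
        "share_observer_error": var_obs / total,
    }


# Mean squares from Table IV; each tryout had m = 3 visits by teams of n = 2.
result = coefficients(57.1348, 17.4116, 6.7121, m=3, n=2)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two places, the output matches the figures in the text: components of 6.62, 5.35 and 6.71, a reliability of .35, observer agreement of .64, and error shares of 29 and 36 percent.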

A similar analysis was carried out for each 
of eight scales employed in Cornell’s technique. 
The results are summarized in Table V which 
gives the estimates of variance components and 
the coefficients of reliability and observer
agreement.

Three of the scales did not detect differences
among the teachers in this study: Pupil Climate,
Pupil Initiative, and Content. In the instances
of Pupil Initiative and Content there is evidence
that observers were able to agree on scores
based on a single performance to the extent
necessary to achieve correlations of .43 and .23
respectively; but there was so much variation
from one performance to another that no stable
difference among teachers was detected. The
Pupil Climate dimension was not observable by
the six observers employed; no two of them could
agree about the score to be assigned a given
performance observed by both.

*When these curves were plotted, the estimate of
each component was computed according to equation
(7), whether or not the component could be shown
to be different from zero.

There are two scales, Variety and Teacher
Climate, for which no error due to instability of
teacher performance was detected. This suggests
that the performance of teachers in these
respects is relatively stable.

The reliabilities of the best five scales are
all of the same order of magnitude, ranging from
.32 to .42. None of these values seems to be high
enough to be used for estimating the typical
performance of an individual teacher. However, it
is apparent from equation (6), which may be
written:


$$r = \sigma_t^2 / [\sigma_t^2 + \sigma_{tv}^2 / m + \sigma^2 / mn],$$


that by increasing either n (the number of
observers on a team; that is, the number in the
classroom at one time) or m (the number of visits
made to the classroom) the reliability can be
increased. If n is allowed to increase without
limit while m remains the same, $R_{mn}$ approaches
a value smaller than one, because increasing n
reduces observer errors only, while increasing m
reduces both types of error. The question of an
optimal way of distributing a fixed number of
observer-hours (that is, how large to make n when
the number of observer-hours mn is fixed) may be
answered on the basis of the graph shown in
Figure 1.

Figure 1 shows the reliability of five Cornell
scales as a function of n, the number of observers
visiting a teacher at the same time, when mn
(the number of observations) is equal to twelve.
Thus, if one observer visits one teacher at a time,
twelve visits must be made, each observer visiting
the teacher on a different day; if two observers
visit the teacher at one time, six visits must
be made; and so on, up to the case n = 12, in
which all twelve observers must visit the teacher
at one time.

The curves in Figure 1 show, unmistakably,
that the reliability of each of the scales falls off
as team size increases. For a given cost per
observer-hour, there seems little doubt that the
greatest precision per dollar spent is obtained by
sending observers into classrooms one by one.
It might be remarked in passing that if the
same observer visits a teacher twelve times, the
reliability is substantially higher than when
twelve different observers visit the teacher once
each, because in the former case $\sigma^2 = 0$, and
hence the quotient in equation (6) is greater.
However, such reliability is probably gained at the
expense of validity, because observer biases
are not cancelled out, but remain to distort
teacher differences.


TABLE V

SUMMARY OF RESULTS OF RELIABILITY ANALYSES OF EIGHT CORNELL SCALES

                            Components of Variance                           Coefficient
                        True     Visit    Observer            Reliability    of Observer
Scale                   Score    Error    Error      Total    Coefficient    Agreement
Activity                20.49    10.53    18.63      49.65       .41            .63
Variety                  1.51     0.00     2.13       3.64       .42            .42
Pupil Climate            0.00     0.00     1.85                  .00            .00
Teacher Climate          2.15     0.00     5.78                  .32            .32
Social Organization      3.82     2.96     3.52                  .37            .66
Differentiation          6.62     5.35     6.71                  .35            .64
Pupil Initiative         0.00     7.26     6.43                  .00            .43
Content                  0.00    10.34    34.11                  .00            .23


TABLE VI

ANALYSIS OF VARIANCE OF SCORES FOR WITHALL'S CATEGORY 1:
LEARNER-SUPPORTIVE STATEMENTS

Source of                      Mean Squares
Variation         d.f.    Observed    Expected
Teachers            3      81.015     $\sigma^2 + 2\sigma_{tv}^2 + 16\sigma_t^2$
Visits              7      98.623     $\sigma^2 + 2\sigma_{tv}^2 + 8\sigma_v^2$
Visit Error        21      24.468     $\sigma^2 + 2\sigma_{tv}^2$
Observer Error     32       1.703     $\sigma^2$
Total              63


FIGURE 1
The Reliability of Certain Cornell Scales
as a Function of Team Size when the
Total Number of Visits is Twelve
(Vertical axis: reliability. Horizontal axis: team size. Curves:
Differentiation, Social Organization, Variety, Activity, Teacher Climate.)
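
The trade-off between team size and number of visits can be tabulated directly from equation (6). The sketch below is a hedged illustration: the function name `reliability_of_mean`, the constant `BUDGET` and the list `curve` are hypothetical, and the component values are the Differentiation estimates (6.62, 5.35, 6.71) reported in the text. It holds the product mn at twelve and varies n.

```python
def reliability_of_mean(var_teacher, var_visit, var_observer, m, n):
    """Equation (6): reliability of the mean of m visits by teams of n observers."""
    true_part = m * n * var_teacher
    return true_part / (true_part + n * var_visit + var_observer)


# Differentiation component estimates as reported in the text.
VT, VV, VO = 6.62, 5.35, 6.71
BUDGET = 12  # observer-visits per teacher, the case described for Figure 1

curve = []
for n in (1, 2, 3, 4, 6, 12):
    m = BUDGET // n  # visits that the budget allows with teams of size n
    curve.append((n, m, reliability_of_mean(VT, VV, VO, m, n)))

for n, m, r in curve:
    print(f"team size {n:2d}, visits {m:2d}: reliability {r:.3f}")
```

For these inputs the projected reliability falls steadily as the team grows, from about .87 with single observers on twelve visits to about .53 with all twelve observers present at one visit.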

Figure 2 shows the reliability to be expected
for any number of visits (by a single observer)
up to twenty. The rate of increase varies for
different scales, but all of them level off. If a
reliability of .90 is required for a particular
purpose, sixteen visits would have to be made
if Differentiation, Social Organization, Variety,
and Activity are to be scored. Twenty visits
will bring Teacher Climate scores up to this
level also.
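
For a single observer (n = 1), equation (6) reduces algebraically to the familiar step-up form m R / (1 + (m - 1) R), where R is the single-record reliability of equation (4). The sketch below illustrates this with the Differentiation estimates from the text; the function names and the dictionary `projection` are hypothetical, and the figures are projections from the estimated components, not readings from the printed graph.

```python
def single_record_reliability(var_teacher, var_visit, var_observer):
    """Equation (4): reliability of one record from one visit."""
    return var_teacher / (var_teacher + var_visit + var_observer)


def stepped_up(r_single, m):
    """Reliability of the mean of m single-observer visits (step-up form)."""
    return m * r_single / (1 + (m - 1) * r_single)


r1 = single_record_reliability(6.62, 5.35, 6.71)  # Differentiation estimates
projection = {m: stepped_up(r1, m) for m in (1, 5, 10, 16, 20)}
for m, r in projection.items():
    print(m, round(r, 3))
```

At sixteen visits the projected value is just under .90, which is in line with the sixteen visits mentioned above for the Differentiation scale.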


Application of the Design to the Tryouts 
of Withall's Technique 


The second of the two techniques employed
was based on Withall's categories of verbal
behavior (15). The procedure used has been
described elsewhere (8). The method consists in
classifying the statements made by a teacher into
seven mutually exclusive categories: 1.
Learner-supportive; 2. Acceptant or clarifying;
3. Problem-structuring; 4. Neutral; 5. Directive;
6. Reproving, disapproving or disparaging;
and 7. Teacher-supportive. The first three
categories were combined and the category obtained
was called "Learner-centered" statements.
The ratio of the sum of these three statement
categories to the total number of statements
made by a teacher is called the "Climate Index."

The tryouts with the Withall technique
employed two observers working as a team with
four teachers in a single elementary school.
Each teacher was visited by the team of two
observers on four occasions about a week apart.
The observers remained in the classroom during
each visit until approximately 100 statements
had been classified. After comparing notes and
clarifying the definitions of the categories, the
same two observers visited each of the four
teachers four more times at one-week intervals.
Thus, there were available, finally, a total of 64
counts in each category, corresponding to eight
visits by two observers to four teachers; in the
notation of this report, m = 8, n = 2, N = 4.

The count for each category for one period
was divided by the total number of remarks
tallied in that period. The proportion so obtained
was then transformed to an angle measured in
degrees by the use of the arc sine transformation
(12:449-50). Such scores have the advantage
of having standard errors of measurement
which are independent of the magnitude of the
score.
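
A small sketch may make the scoring step concrete. It assumes that the arc sine transformation referred to is the usual angular one, angle = arcsin of the square root of the proportion; the source does not spell out the formula, so this is an assumption. The function name `arcsine_degrees` and the tallies are invented for illustration.

```python
import math


def arcsine_degrees(count, total):
    """Angular transformation of a proportion: arcsin(sqrt(p)), in degrees."""
    p = count / total
    return math.degrees(math.asin(math.sqrt(p)))


# Hypothetical tallies: 25 learner-supportive remarks out of 100 classified.
angle = arcsine_degrees(25, 100)
print(round(angle, 1))
```

Under this form of the transformation a proportion of .25 corresponds to an angle of 30 degrees and a proportion of .50 to 45 degrees.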

The design used for analyzing each category
of response was as shown in Table VI, which
also gives the results for Category 1:
Learner-supportive statements.


For these data, $H_1$ ($\sigma_{tv}^2 = 0$) was rejected and
$H_0$ ($\sigma_t^2 = 0$) remained in doubt, since the F ratio
of 3.31 falls between the .01 and .05 points of the
F distribution. The components of variance
attributable to observer error, visit error, and
differences among teachers were estimated to be
1.703, 11.383, and 4.552 respectively.

Similar analyses were made of the scores for
categories 3, 4, 5, 6, and the Climate Index
defined above. The results are presented in Table
VII in the form of estimates of variance
components and reliability coefficients. As before,
components not found to differ significantly from
zero are reported equal to zero; and when $\sigma_t^2$
does not differ from zero, the corresponding
reliability coefficient is reported as zero. No
analysis of categories 2 and 7 was made since
remarks classified in these categories were so
rare that an analysis of them did not promise to
be fruitful.

Two categories failed to show reliability greater
than zero, Neutral and Reproving; and one,
Learner-supportive, remained in doubt. The low
coefficient of observer agreement reported for
Neutral statements indicates that the definition of
this category may be unclear; that for Reproving
statements is high enough to suggest that the
failure of the scores to discriminate teachers is
principally due to instability of this aspect of
teacher behavior. This instability is also reflected
in the relatively larger component of variance due
to visit errors. The same conclusion is indicated
regarding Learner-supportive statements. The
remaining three categories have reliability
coefficients around .50.

Curves like those in Figures 1 and 2 can be
plotted for these data. When plotted, the curves
indicated that all three of the "reliable" scales
(Problem-structuring, Directive, and Climate
Index) should reach a reliability of .90 when they
are based on ten visits.


Discussion 


The results of the analyses presented above
illustrate some of the practical advantages of the
analysis of variance over correlation analysis in
estimating the reliability of observational data
when more than two scores per person are
available. First, the analysis of variance yields a
single estimate of the reliability coefficient which
uses all of the information contained in the data;
second, the analysis of variance makes it possible
to partition the error into components
attributable to different sources; and, finally, it is
possible (if the necessary assumptions are
fulfilled) to test the significance not only of the
coefficient obtained, but also of each component
of error.

In the study of the Cornell technique, for
example, no fewer than 36 correlation coefficients
estimating the reliability of the technique could
be obtained, all equally accurate. Each such
estimate would be a product-moment coefficient
based on 11 cases, so that the accuracy of any
one would be low. Moreover, each estimate
would have a considerable bias (4:205). A mean
of the 36 coefficients could be used, but the bias
would remain (or perhaps increase, since the
estimates would not be independent); and nothing
is known about the sampling error in such a
mean. The estimate obtained by analysis of
variance, however, is unbiased (4:225), unique, and
of known precision.


FIGURE 2
The Reliability of Certain Cornell Scales
as a Function of Number of Visits

In each of the examples given we have
partitioned our error variance into two components,
and shown how such a partition of errors can be
used in planning further uses of the observational
technique by indicating where some of the errors
originate, and can yield estimates of different
correlations. A different design could separate
errors due to differences in behaviors observed
on different days from differences in behaviors
observed on the same day; errors due to
differences between observers from differences
in what a given observer sees in different
five-minute periods during the same visit, etc. A
"reliability" coefficient corresponding to each
type of error could be estimated, the relative
importance of each source of error could be
assessed, and plans for future observations could
then be made more intelligently.

When an instrument of low reliability is tried
out on a small scale, as is usually the case when
the instrument requires a rather large expenditure
of a trained observer's time before even
one measurement is obtained, it is essential that
it be possible to test the hypothesis that the true
reliability of the scale is zero, as well as to
estimate its magnitude, since sometimes a
reliability large enough to appear useful may be
nonsignificant. Such tests are easily made as part
of the analysis of variance. It is also possible
to test whether a certain suspected source of
error (as, for example, observers' fallibility) is
in fact making a significant contribution.

These advantages of analysis of variance
clearly indicate the unsuitability of the
correlational method when the data available
include more than two independent measures of each
individual. The only situation in which the latter
method might be useful is that in which a set of
N pairs of scores on equivalent forms of a test
are available. Indeed, the reliability coefficient
is often defined as the correlation between
scores on equivalent tests in the population of
individuals. It is natural to assume that the
correlation in the sample is the appropriate
estimate of the correlation in the population.
But when the correlation in question is a
reliability coefficient this is not true.

If we are correlating a test x and another
measure y, the population correlation may be
written as follows:


$$R_{xy} = \sigma_{xy} / \sigma_x \sigma_y,$$


where $\sigma_x$ and $\sigma_y$ are the standard deviations of
the two measures and $\sigma_{xy}$ is their covariance.
The appropriate estimate of $R_{xy}$ from a sample of
N pairs of scores x and y is the product-moment or
interclass correlation, which may be written:


$$r_{xy} = S_{xy} / S_x S_y, \tag{9}$$


where $S_{xy}$, $S_x$ and $S_y$ estimate $\sigma_{xy}$, $\sigma_x$, and
$\sigma_y$. But if we are correlating a test x and an
equivalent test x', the population correlation is


$$R_{xx'} = \sigma_{xx'} / \sigma_x^2,$$


where $\sigma_x^2$ is the variance of either test, and
$\sigma_{xx'}$ is the covariance of the two. The
product-moment correlation coefficient used to
estimate this would be


$$r_{xx'} = S_{xx'} / S_x S_{x'},$$


where $S_x^2$ and $S_{x'}^2$ are estimates of $\sigma_x^2$ from
each of the two tests, and $S_{xx'}$ is the estimated
covariance. It is clear that we are using the
geometric mean of two sample estimates to estimate
the population variance, $\sigma_x^2$.
If we analyze the total variance of the N pairs
of scores x and x' into two components, one from
comparisons between individuals, with N - 1
degrees of freedom, and one from comparisons
within individuals or "error", with N degrees of
freedom, we can obtain the intraclass correlation
as follows:


$$r = \frac{(\text{mean square between}) - (\text{mean square for error})}{(\text{mean square between}) + (\text{mean square for error})} \tag{10}$$


This is an estimate of $R_{xx'}$ which may be
shown to be related to the interclass correlation
$r_{xx'}$ as follows:


$$r = (2 S_x S_{x'} r_{xx'} - K) / (S_x^2 + S_{x'}^2 + K), \tag{11}$$


where $r_{xx'}$, $S_x$, and $S_{x'}$ are as defined above,
$\bar{x}$ and $\bar{x}'$ are the test means, and


$$2NK = N(\bar{x} - \bar{x}')^2 + 2 S_x S_{x'} r_{xx'} - S_x^2 - S_{x'}^2.$$


It can be shown that K is never negative, and
that when K is greater than or equal to zero, r is
smaller than $r_{xx'}$. We may, therefore, assert that
the estimate $r_{xx'}$ is always greater than the
estimate r.

Fisher (4:205, 211 ff.) points out that $r_{xx'}$
systematically overestimates $R_{xx'}$, but that r does
not, and that the latter estimate is more precise
than the former. The bias in $r_{xx'}$ is small
except when r is small, and the difference in
precision is slight.

The procedure usually followed in actual 
practice is to analyze the variance of the 2N 
Scores into three components rather than two; 
one for differences between individuals, with 
N - 1 degrees oí freedom, one for differences 
between test means with one degree of freedom, 
and one for residual or error, with N - 1 de- 
grees of freedom. The estimate r of reliability, 
as defined in formula (10) above, may then be 
written: 


г = 2SxSxt rxx! / (Sx? + Sy! ?) (12) 


Comparing this with formula (9) we see that r 
differs from rxx' in that it uses the arithmetic 
mean of the two sample estimates of ox? instead 
of the geometric mean. 
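
A short numerical check of the three-component analysis may be helpful. This is a sketch only, in Python, with made-up scores and function names of our own choosing; variances are assumed to use the N - 1 divisor. It removes the difference between test means before forming the error mean square and compares the resulting r with the value given by formula (12).

```python
from statistics import mean, variance


def reliability_three_component(x, x_prime):
    """r from individuals, test means, and residual error."""
    n = len(x)
    mean_x, mean_xp = mean(x), mean(x_prime)
    grand_mean = (mean_x + mean_xp) / 2
    ss_total = sum((v - grand_mean) ** 2 for v in list(x) + list(x_prime))
    # Between individuals: N - 1 degrees of freedom.
    ss_between = sum(
        2 * ((a + b) / 2 - grand_mean) ** 2 for a, b in zip(x, x_prime)
    )
    # Between test means: one degree of freedom.
    ss_tests = n * ((mean_x - grand_mean) ** 2 + (mean_xp - grand_mean) ** 2)
    # Residual or error: N - 1 degrees of freedom.
    ms_between = ss_between / (n - 1)
    ms_error = (ss_total - ss_between - ss_tests) / (n - 1)
    return (ms_between - ms_error) / (ms_between + ms_error)


def reliability_formula_12(x, x_prime):
    """2 Sx Sx' rxx' / (Sx^2 + Sx'^2); note that Sx Sx' rxx' is the covariance."""
    n = len(x)
    mean_x, mean_xp = mean(x), mean(x_prime)
    cov = sum((a - mean_x) * (b - mean_xp) for a, b in zip(x, x_prime)) / (n - 1)
    return 2 * cov / (variance(x) + variance(x_prime))


first = [1, 2, 3, 4, 5]    # made-up scores, first testing
second = [2, 2, 4, 4, 6]   # made-up scores, second testing
print(round(reliability_three_component(first, second), 4))
print(round(reliability_formula_12(first, second), 4))
```

Both lines print the same value for these scores, since the difference between test means has been taken out of the error term.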

The internal consistency reliability of a test 
may be estimated from an analysis of variance 
of N pairs of half-test scores in either of the two 
designs described above from the formula: 


$$r = \frac{(\text{mean square between}) - (\text{mean square for error})}{(\text{mean square between})}$$
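
As an illustration only, the internal consistency estimate can be computed from half-test scores as follows. The sketch is in Python, uses made-up half-test scores and the two-component design, and the names are ours. The comparison with the Spearman-Brown step-up of the half-test intraclass r is an algebraic identity we add as a check; it is not part of the procedure described above.

```python
def mean_squares(half_a, half_b):
    """Between-individuals and within-individuals mean squares for N pairs."""
    n = len(half_a)
    grand_mean = (sum(half_a) + sum(half_b)) / (2 * n)
    ms_between = sum(
        2 * ((a + b) / 2 - grand_mean) ** 2 for a, b in zip(half_a, half_b)
    ) / (n - 1)
    ms_error = sum((a - b) ** 2 / 2 for a, b in zip(half_a, half_b)) / n
    return ms_between, ms_error


def internal_consistency(half_a, half_b):
    """(mean square between - mean square for error) / mean square between."""
    ms_between, ms_error = mean_squares(half_a, half_b)
    return (ms_between - ms_error) / ms_between


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length (added check)."""
    return 2 * r_half / (1 + r_half)


odd_items = [1, 2, 3, 4, 5]    # made-up half-test scores
even_items = [2, 2, 4, 4, 6]   # made-up half-test scores
ms_b, ms_e = mean_squares(odd_items, even_items)
r_half = (ms_b - ms_e) / (ms_b + ms_e)
print(round(internal_consistency(odd_items, even_items), 4))
print(round(spearman_brown(r_half), 4))
```

The two printed values agree, which shows that the formula already incorporates the correction for test length.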


Summary 


A procedure for estimating the reliability of 
scores based on observations of behaviors was
described, and its use illustrated in two some- 
what different situations. A discussion of the 
relative merits of analysis of variance and cor- 
relational analysis as techniques for estimating 
reliability coefficients led to the conclusion that 
the former has three distinct advantages over 
the latter. It yields a single best estimate of re- 
liability; it supplies independent measures of the 
amount of error from different sources, and it 
provides for simple, exact tests of significance. 
When only two sets of measurements are avail- 
able, an estimate of reliability may be obtained 
by correlational analysis, but it is biased and 
has a larger sampling error than that obtained 
by analysis of variance. When more than two 
sets of measurements are available, no satis- 
factory estimate can be obtained by correlation- 
al analysis. We, therefore, suggest that the 
use of the correlational technique be limited to 
validity estimation, and that the analysis of var- 
iance be adopted as the standard procedure for 
estimating reliability. 
SECTION I


Statement of the Problem 


IN A LARGE city school system employing
over nine thousand teachers it is readily conceivable
that there will be many differences among
teachers in terms of their methods, philosophy,
goals, behavior, and their attitudes toward children
and adults. Even in a single school, wide
differences are not hard to find, yet most schools
seem to operate in a fairly smooth, efficient,
and productive manner. Looking still more
closely, those of us who are on the “inside” of
the educational scene have seen, in all probability,
evidence of differences of varying degrees
within even the smallest formal school unit, the
department. It is this unit, considered as a functioning
group, with which our study was concerned.
Specifically, the problem was to analyze
and describe what happened when a teacher-administrator
group initiated, organized, conducted,
and evaluated a curriculum improvement
project in one department of an urban junior high
school over a forty-week period.


Major Hypotheses, Related Questions 


The plan and procedures of this study were
aimed at finding answers to the following questions:


1. What changes, if any, occur in the teachers'
perceptions of their own responsibilities in
teaching when they become involved in cooperative
curriculum improvement?

2. What will be the outcomes of a small developmental
study which seeks to stimulate by largely
non-directive means the improvement of instruction
in a single department within one urban
junior high school?

3. What conditions or factors appeared to be
influential in tending to make the teachers productive
and creative in the group?


Five Major Hypotheses 


The hypotheses tested in this study are stated 
below. 


1. There will be tangible modifications in 
classroom practices of the teachers in- 
volved in the study. 

2. A variety of instructional methods will be 
tried and tested. 

3. There will be an increase in the confidence 
of teachers in defining problems. 

4. Teachers will feel increasingly secure in 
exchanging suggestions with each other. 

5. The Head of the English Department and the 
Principal will allow and encourage teachers 
to try, test, and develop newer methods, 
techniques, and courses. 


Related to the testing of these hypotheses,
some data on the questions below were looked for
and examined.


1. What are the strong points of the action re- 
search method and the work-group-confer- 
ence technique? 

2. Under what conditions can these methods 
best be used in creating curriculum change? 

3. What are the limitations of the methods and 
their application? 


Two Assumptions Underlying the Study 


Two basic assumptions were made by the writ- 
er at the outset of the study. 


Assumption One: The curricular experiences
of pupils are determined in large measure
by the values, goals, skills, and attitudes
held by teachers.

Assumption Two: In order to change the curriculum
it follows that there must be undertaken
an attempt to change the values,
goals, skills, and attitudes of the people
involved in respect to education (2), but
more specifically in respect to interpersonal
relations among members of a working
group.

*An abstract of an unpublished Ed. D. dissertation, Wayne State University, 1957.


The ideas expressed in these assumptions were
accepted and shared, at the beginning of the
study, by the Supervisor of Language Education,
the principal of the participating school, and the
writer. In addition, the Director of Language
Education for the city school system also endorsed
these assumptions when he actively
launched the project.


Background and Significance of Curriculum 
Development 


One of the greatest chronic problems in education
in the United States during the past thirty-five
years has been the apparent lack of utilization
of research findings by teachers in the nation's
classrooms. Tremendous quantities of
research results fill library shelves and, although
a great part of these research findings
could be of inestimable value, they have been
barely tapped. There are many reasons for this
rejection or ignoring of research on the part of
teachers, but, reasons or not, unless this pattern
is changed the youth of our country will continue
to pay the price in the form of less full,
beneficial, crucially-needed education.

The background of one hundred years of curriculum
development is briefly and concisely
presented in an NEA bulletin, 100 Years of Curriculum
Improvement, 1857-1957 (1). Prepared
by the Association for Supervision and Curriculum
Development, the bulletin traces major
changes in the concept of learning, teaching and
curriculum improvement. Briefly, some of
these are:


1. The change from the faculty psychology of
learning to an organismic, dynamic psychology...
with emphasis on meaning, goal-seeking
and integration in the learning process.

2. Change from reliance on tradition and subjective
judgment... to concern for scientific
research and the application of scientific
methods and findings.

3. Changes in methods and materials (used
in teaching) have grown out of the idea that
how we learn is as important as what we
learn.

4. Changes in our total approach to children
in learning situations have been influenced
by the findings in the field of Child Growth
and Development.

5. Changes in patterns of participation in curriculum
building:
a) Fixed body of subject matter set by “experts” to
b) Shared participation: teachers, pupils,
lay people led by
c) Administrators, supervisors, and
resource persons.


Action Research 


Action research, based on the “field theory”
psychology of Kurt Lewin (5), is the newer research
approach to educational problems because
it has within it the potential to apply
social psychology to “natural” social groups.
Good action research employs mathematical
means of measurement and testing,
analysis and other tools of fundamental research.


Teacher Participation 


During the past one hundred years the change
in “Who shall build the curriculum?” has been
most marked. The current view that teachers ...
results of research into the classroom. Administrators,
teachers and supervisors have the opportunity
now, as never before, to work
cooperatively to improve all phases of education.


Factors in Cooperative Effort 


To create actual improvement in the classroom
it is imperative that teachers understand,
appreciate, develop and apply research. In order
for them to do this they must be given opportunity
to learn, change and improve. Administrators
and supervisors must afford this kind of opportunity.
They must structure a framework
which is conducive to good personal relations and
gives a chance for free expression and developmental
acquisition of research skills. The
administrators and supervisors must be willing
to support cooperatively created changes in
the total curriculum.


The Work-Group-Conference Method 


Meier, Cleary and Davis (6) drew on a number
of fields to create a technique of cooperative
action which they labeled the “work-group-conference
method”. This method is one of the newer
tools with which a good human relations, action
research approach can be realized. It seems to
have within it the potentials to release the
creative and productive talents of
people working and acting in harmony. It seems
to be a technique by which supervisors, consultants,
and staff generally as well as principals
and other administrators can achieve successful
improvement of their own and others' behavior
and, consequently, clearer thinking and sharper
action to reduce problems inherent in the educational
scene.


A Means of Problem Attack, Action, and 
Developing Research Abilities 


The work-group-conference method lends it- 
self ideally to problem solving because it has a 
social-psychological basis which encompasses
the total aspects of the individual, the group, 
and the environment in which these operate at a 
given time. A supervisor, when facing instruc- 
tional problems, can employ the technique of 
work-group-conference method in an action re- 
search frame and lead in helping teachers to 
solve the problems in a cooperative and scien- 
tific manner. 

Stephen M. Corey lists the following as “significant
elements of a design for action research” (4):


1. The identification of a problem area about 
which an individual or a group is sufficient- 
ly concerned to want to take some action. 

2. The selection of a specific problem and the
formulation of a hypothesis or prediction 
that implies a goal and a procedure for 
reaching it. This specific goal must be 
viewed in relation to the total situation. 

3. The careful recording of actions taken and 
the accumulation of evidence to determine 
the degree to which the goal has been 
achieved. 

4. The inference from this evidence of gener- 
alizations regarding the relation between 
the actions and the desired goal. 

5. The continuous retesting of these generali- 
zations in action situations. 


Bases of Action Research (Summary) 


1. It is based on the social dynamics theories 
of Kurt Lewin. 

2. The psychological basis of the social dynamics
theory is grounded on what is generally
termed the “field” theory-type of psychological
action.

3. Action research is usually carried out in a
field setting in contrast to a laboratory setting.

4. It is an extension of basic social research
and includes in its methods the utilization
of mathematical and conceptual problems
of theoretical analysis.

5. This research lends itself to immediate application
in on-going developmental situations.

6. Although it is not an inherent characteristic 
of action research that it always be a coop- 
erative enterprise, the application of find- 
ings is usually more effective if the investi- 
gator works in close collaboration with the 
persons of the agency or institution being 
studied. 


Philosophy of Cooperative, Developmental 
Improvement 


One of the greatest deterrents to research on 
the part of teachers (and also on the part of super- 
visors who want to involve teachers in research) 
is the fear, apparently, that the research will not 
conform to ‘‘high standards". Also, on the part 
of teachers, research in the traditional sense 
seems too far removed to be of fairly immediate 
help with problems of highly immediate import- 
ance. 

The work-group-conference method encour- 
ages developmental growth in teachers' research 
abilities. Within a typical teacher-administrator 
or other adult group several levels of sophistication
in research ability will usually be found. Assuming
that the group is well led and that conditions
necessary for its successful operation are
present, the members are likely to become se- 
cure and reasonably confident in attempting to set 
up a design and try objective problem solving. 


The fact that attempts at problem solving 
fall at various points on a continuum rang- 
ing from careless, untested inquiry to 
careful and reliable research is rarely em- 
phasized. This is regrettable, because, 
although teachers and other people value 
research in the abstract, they feel that it 
has little relation to the methods they must 
employ to solve their own problems. There 
is little motivation for practical problems 
to move in the direction of better and bet- 
ter research methods. They are learned 
with practice. To refrain from trying be- 
cause one lacks skill or has perfectionist 
aspirations precludes improvement, and 
improvement is what counts. (4) 


It is against this background of supervision and 
curriculum development theory and current con- 
cepts of research in this field that the problem of 
this study took form. 


SECTION II
Structure and Development of the Study 


THE GENERAL plan of the study included: 
1) inviting teachers to participate voluntarily in 
a two-semester project; 2) enlisting the support 
of the principal, department head, and specialist
supervisor; 3) providing for biweekly meet-
ings over two full semesters; 4) providing for 
guest speakers, special materials, films, visits 
to other schools, etc.; 5) the author of this re- 
port serving as coordinator and organizer of 
group activities, encouraging members of the 
group to undertake experimentation in their own
classes and to report back to the group; 6) measuring
teachers' perception of roles of group's
status people.

The report of the study described four phases 
of our group’s development, phases similar to 
those that are traced by Thelen and Dickerman
in “Stereotypes and the Growth of Groups” (7). 
Because we followed the pattern of tracing major 
phases of the group’s growth, it is important to 
list briefly questions related to each phase (3). 


1. What happened to the project of our group? 

2. What happened to our group and the indi- 
viduals in it? 

3. What blocked the work of our group? 

4. What facilitated the work of our group? 


A final item under each phase was: 


5. Summation of evidence and interpretations
of each phase.


Methodology: Procedure and Sources of Data 


This project was an action research, cooper- 
ative type of study. All the participants worked 
on one major problem: the improvement of instruction
in English at this one junior high school.


The major features of the methodology include 
the following: 


1. Each teacher had the freedom to work on 
a specific problem in the English (Language
Arts) area. 

2. Teachers were encouraged to work in a 
manner of their own choice: a) cooperatively on 
one problem; b) individually on separate prob- 
lems within the scope of the English curriculum; 
or c) in freely formed subgroups on one or several
problems.

3. The teachers’ populations for study were 
their own pupils from one or more of their own 
classes and/or the available data on all pupils
(cumulative records, reading test scores, fam- 
ily background information, etc.). 

4. The writer's population for the study was
not the pupils but all the twelve members of the
study group.


Part of the methodology includes definition
of the roles of various people in the study: 1) the
principal, 2) the department head, 3) the supervisor,
4) the coordinator, and 5) the roles of
the nine participating teachers.

The work-group-conference method was intrinsic
to the broad action research methodology
of the whole project. The total group met about
every two weeks (total of eighteen meetings in two
semesters) after school for planning,
reporting, and evaluation. The average length
of each meeting was two hours and fifteen minutes.
The group focused its attention on “content”,
viz., various aspects of English:
grammar, composition, testing, spelling,
writing and other things. The writer was concerned
with interaction, the dynamics of the group
situation and any interaction between meetings as
well as with the English curriculum: reading,
spelling, writing, listening, grammar,
etc.

*All footnotes will be found at end of article.


Types and Sources of Data Used1* 


The following is a descriptive listing of the
types and sources of data which were obtained
over the entire forty-week period, starting in September
1955 and ending in June 1956. Some additional
data were obtained in August and September
of 1956, and this will be the last item on this list.
Because the data were gathered in an on-going,
evolving “situational frame”, no attempt is made
here to place the items chronologically in terms
of “when” they were collected.

1. Descriptions of Individual Research Projects.
Each teacher submitted a “Progress Report”
on his research project during the thirty-second
week of the project. In the fortieth week
each teacher contributed a Final Written Report
on his project in which he described, analyzed,
and interpreted data gathered in his experiment.

a) These reports were consolidated into the
Group Final Report and submitted to the
Director of the Language Education Department.

b) Each member of the Study Group received
a hectographed copy of the Group
Final Report.

2. Oral Reports. Some members of the group
gave oral reports on their projects during the
course of the Study. A discussion period followed
each of the reporting sessions.

3. Evaluation of the Project. Each member of
the committee (group) was asked to respond (in
writing) to the following: “What data, ideas, opinions
or impressions have you gained from (a) the
particular project you selected, and (b) what effect
have the Study and the conferences had upon
your approach to your teaching problems?”

a) This was done partially in the Progress
Reports mentioned in item 1 in this list.

b) More evidence, although again only partial,
was gotten in a more controlled
manner through administering an “Opinionaire”;
this device was administered
twice, the second time at the suggestion
of the group.

4. Statements of Purpose. During the fourth 
week of the project each member was asked to 
state, in writing, what he perceived his own 
purpose to be in participating in the Study. These
“Statements of Purpose” were compared to
“Self-Evaluation” statements completed at the
end of the study. 

5. Records of Group Meetings. A factual 
record (minutes) of each meeting was kept. Each 
set of minutes was analyzed and interpreted by 
this writer. 

6. Records of Conversations and Consulta- 
tions. Insofar as it was possible to be accurate 
and objective, several relevant talks between 
this writer and individual members of the group 
were described. 

a) We tried to develop here the “Key Peo- 
ple" concept and how it relates to friend- 
ship factors and informal communica- 
tion. 

b) Attempts were made here, also, to show
(1) comparison of some members’ pri- 
vately expressed views on the project 
with those expressed at Group Meetings; 
(2) values of liaison between the coordi- 
nator and key members of the group 
outside of regular group meetings. 

7. Measures of Rise and Fall of Members' Interest
and Attitudes Toward the Project. “End-of-Meeting
Evaluation Slips” were given to the
group members periodically. Each individual 
filled out such a slip. 

a) A “Consolidation Sheet” showing in
minute detail all the responses made on 
the individual End-of-Meeting Slips was 
prepared by the coordinator and given 
to each group member. 

b) Both the individually completed slips 
and the Consolidation Sheet show rise 
and fall of interest as well as general 
attitude. 

c) The “Slips” gave us an index on each 
individual while the “Consolidation 
Sheet" showed a group (total) reaction. 

8. Data on Power Structure. Evidence was 
gathered indicating what the group members per- 
ceived to be the power structure of the total 
group. 

a) An “Opinionaire”, a type of projective
instrument, was prepared by this writer
and administered to all members of the
group.

b) The Opinionaire sought to discover specifically
“what the teacher-members perceived
the roles of the principal, department
head, supervisor, and the coordinator
to be in this study.”

c) The Opinionaire was given a second time
to only the teacher-members at their own
suggestion; the results of the first and second
responses will be compared.

9. Evidence of Attitudinal Changes in Each of 
the Status People in the Group. This evidence is 
gathered from all of the sources mentioned above 
but treated separately in order that discrete statements
may be made about each of the three “status”
people—the principal, the department head, and 
the supervisor. 

10. Self-Evaluation Statements. Some members
of the group were invited to submit a descriptive
statement in which they attempted to answer:
“What did involvement in this project do for me
personally?” The responses to this question will
also be compared to the “Statement of Purpose”
prepared at the beginning of the study. The Self-Evaluation
statement was asked for during the
summer following the termination of the study.


Anticipated Outcomes of the Study 


1. It was felt that the study would give us var- 
ious kinds of evidence regarding the practicality 
and efficiency of using the work-group-conference 
method to accomplish curriculum change in a field 
situation over a relatively short period of time
but under intensive application. 

2. Some anticipated, specific outcomes in the 
field of supervision were focused on questions 
such as the following:

a) Can teachers’ values and attitudes rela- 
tive to education be effectively changed 
through involvement in cooperative group 
effort in curriculum improvement? 

b) Will the changed values and attitudes be 
reflected in the curriculum? 

c) What are the elements in a group situa- 
tion that help people work together har- 
moniously? 

d) What conditions are conducive to stimula- 
ting people to be creative in a group pro- 
ject? 

e) What factors help or hinder communication
among people in a group?

f) Are “key” people needed to initiate, develop,
and maintain a group as it passes
through the various stages of development?
If so, who are they? What identifies
them?

3. In regard to working on a departmental level,
some evidence should be of value to department
heads and others interested in working toward
change through the departmental unit.

4. For teachers and others interested in attempting
further change of the curriculum in the
English or Language Arts area, this study
should give insights to the following:

a) What aspects of the English curriculum 
do the teachers rate ‘‘most important’’? 

b) Are teachers’ differences more appar- 
ent than real in their view and practice 
of teaching English? 

c) Can “pilot groups" in one or more 
schools effectively influence curriculum 
patterns in other schools of the same 
system? 

The study succeeded, we believe, in making
concrete the vague “intangibles” that comprise
what is known as a “group”. Communication
between people was one of the common factors 
examined directly and indirectly. It would be 
well for the reader to remember at all times the 
importance the writer gave to the communication 
factor throughout the study. 


Limitations of the Study 


Generalizations cannot be made from this
study to any other population but that involved in
the project. This is a descriptive study, one
which employs a case-study approach. The fourteen
teachers and administrators involved as
well as the pupils with whom the teachers worked
are the limits of the population to which generalizations
can be applied.

Because it did utilize the case-study approach,
however, the results can serve as an indication
of what could be expected of the work-group-conference
method, action research, etc., under
reasonably similar circumstances.

Another limitation arises from the fact that
the coordinator-recorder and the teachers were
obviously subject to error in recording, transposing,
documenting and interpreting data. This
was constantly and continu 


ever, and all minutes and 


The Participants— Teachers 


There were originally eleven teachers in the
project, all teachers of English but some with
specialized jobs within the department. Two
taught “general language” in addition to English.
One taught journalism and acted as adviser to the
school newspaper. A fourth member had a “radio
workshop”, a complete broadcasting studio
which held regular daily classes. This teacher
was also the building audio-visual chairman. A
fifth teacher was actually a member of the Social
Studies Department but taught remedial reading
in the English Department. Another type of distinction
among them was the grade levels taught.
One particular teacher handled seventh grade
English classes only, while another taught exclusively
ninth graders.

The age range of the members (an all white
group) was from 22 to over 60 years. Teaching
experience varied from one to more than 40 years.


Participants— Leadership and Administration 


There were four people involved in the study
who by their formally designated positions had to
assume leadership and administrative responsibility
for the total project. These people were:
the Supervisor of Language Education, official
representative from Division of Instruction, Detroit
Public Schools; the writer who served as coordinator
and group recorder; the school principal
who was actually ex-officio chairman of the
group and through whom most of the group decisions
affecting curriculum changes and classroom
experiments had to be cleared; and the head
of the department in Urban School. The latter
worked with the coordinator in forwarding communications
to the group, making emergency adjustments
of meeting schedules, and in making
available time or materials needed by teachers in
carrying out their individual research projects.

The Director of Language Education, though
never deeply nor personally involved in the project,
approved it formally. His assistant, the
Supervisor of Language Education, was the real
administrator-participant, however. It was she
who helped plan, formulate, and direct the project
and set at least part of its major goals.

The coordinator's responsibilities included
initial planning with other group administrators,
arranging for meetings, finding clerical assistance,
keeping the group informed,
and, finally, leading the group to reporting and
evaluation.


The Purposes of the Study Group as They
Were Perceived by Administrators of the
Project: A Restatement


Initially the administrators perceived the
goals of the group differently than did the teachers.
Below is a specific restatement of the purposes
of the group as seen by the coordinator, supervisor,
principal, and department head.

1. To stimulate teachers, by largely non-directive
means, to organize, conduct, and
evaluate a curriculum research project
within their own department in their own
school.

2. To develop in teachers abilities and insights
related to: a) appreciation of and utilizing
research methods; b) applying, testing,
and evaluating results of their own and
others' research in the classroom;
c) values of cooperative effort to improve
the curriculum of their own department;
d) perception of themselves and others
from the view of their own values, skills,
and attitudes as these influence the curricular
experiences of pupils.

3. To develop leadership abilities of the teach- 
ers and to bring them closer to the reali- 
zation that leadership shifts in a group 
situation from one person to another. 

4. To motivate teachers to creativity both in
the group and in the classroom. 

5. To emphasize that research ability is a de- 
velopmentally acquired skill and encour- 
age teachers to work at it. 


Purposes of the Group as Proposed for the
Teachers by Administrators of the Group


1. To improve the instructional program in 
English at Urban Junior High School. 

2. To try new methods, materials, and tech- 
niques in the classrooms. 

3. To understand better existing methods, 
practices, materials, and techniques. 

4. To contribute, ultimately, the results of 
the group's work to a new Curriculum 
Guide in English for Junior High Schools. 


Procedure 


The project was organized around two focal 
points, in terms of its operation: 1) regular 
group meetings held after school at Urban Jun- 
ior High every second week for two consecutive 
semesters, and 2) the carrying out of instruc- 
tional change— experiments, tests, re-examin- 
ation of established methods and materials—by 
the teachers with their pupils in the regular 
classroom situations. The biweekly meetings
were aimed at planning, discussion, and evaluation
leading always to application or modification
of the classroom research being done by the
teachers.

Early in the project it was decided by the
group that the coordinator would be doing a mu- 
tual service for himself and the group by acting 
as recorder. The notes or minutes of each 
meeting, as well as other material needed by 
the group, were then hectographed at the home
of one of the group members. This person hec- 
tographed almost all the materials which grew 
out of the group project. 

The coordinator was present at every meet- 
ing (eighteen in all) of the group. The principal, 
department head, and the supervisor were not 
always present, and when they were did not always
stay for the entire meeting. This was especially
true of the principal who felt, for a time,
that his voluntary withdrawal from some meetings
might “free” the teachers from a restraint
often existent when a status person is in the group.


Periodically, end-of-meeting evaluation slips 
were filled out by teachers and other types of 
data collected from them. Whenever such data 
were requested it was made explicit that the data 
would be used both for feedback to the group and 
for the writer's dissertation. 


SECTION III
Interpretation of Data2 


THE FOCUS of the data-gathering instruments 
was on tracing (1) developmental growth of the
participants in skills and insights related to interpersonal
relations, (2) research abilities and appreciations,
(3) changes in self-perception,
(4) awareness and understanding of the group’s 
power structure, (5) evaluation skills, (6) com- 
munication patterns, and (7) appraisal of methods 
and materials in the English curriculum of Urban 
Junior High School. 

The interpretation of the data was done by var- 
ious means. Very little of the data could be 
quantified or measured by existent mathematical 
and statistical methods. For example, resistance
to an idea or to a person might be expressed
in many ways: facial expressions, verbal response
or withdrawal, bodily movement, degrees
of hostility or enthusiasm, etc. For these kinds 
of data, direct observation and subsequent inter- 
pretation by the coordinator were utilized. 

The Anecdotal Records of Group Meetings
were interpreted by presenting a verbatim ac- 
count of each meeting and analyzing the state- 
ments made against the background of the total 
context of that meeting, the total project, the 
participants themselves, and the ‘‘natural social 
group” factors in a field situation. Also, these 
meetings included and were influenced by activi- 
ties of the participants: discussion, planning, pre- 
senting reports, giving research findings, etc. 
What was said in the meetings was frequently compared
to what was actually done.³

Comparison, then, was a useful tool in the in- 
terpretation of data. Various kinds of data were 
compared. Some examples of these include the 
following: 


1. Comparison of each member's "Statement
of Personal-Professional Purpose for Participation
in the Project" (written early in the study)
with "Description of Individual Research Projects"
and "Self-Evaluation Statements" (the latter
two statements made at the end of the study).

2. As evaluation of the project was constant
and continuous, bimonthly sources of data like
"End-of-Meeting Evaluation Slips" and "End-of-Meeting
Consolidation Sheets" were compared.
These were further compared to each member's
behavior in the group and to his reported research
and teaching activities between meetings.

3. Oral and written reports by individual
members were compared to their original "Problems
and Statement of Purpose".

4. Each member of the group was asked to
answer, in writing, the following questions after
the first half of the study was completed:

"What data, ideas, opinions or impressions
have you gained from a) the particular
project you selected, and b) what effect
have the study and the conferences had upon
your approach to your teaching problems?"

The responses to these questions were also compared
to items 1-3 above.

5. Data on the group's power structure were
obtained by an "Opinionaire", a projective
type instrument. The first time the members
were asked to re[...] the group [...]
they had not answered as a "typical"
member but in reality had projected their
own personal opinions. At the request of the
members the instrument was presented a second
time, two weeks later. On this occasion each
member requested to be allowed to answer with
his own opinion, not that of a hypothetical "typical"
member. After completion of these Opinionaires
(using the same items), comparison was
made to the first responses.

There were 36 questions in each Opinionaire.
Each of the nine teacher members completed
each Opinionaire fully on both occasions. In only
nine instances of a total of 648 responses were
there any absolute changes in response [...]
of these differen[...] [...]cant at the five [...]

Interpretation of data was, in summary, ac[...]
[...]ation, non-[...] content analysis of written
materials, comparisons among a variety of
writ[...] [...] between written material and [...]
meetings. Further, analysis of [...] written
and verbal data was compared to data on [...]
Emotional responses, attitudes vs values [...]
highly inter[...] [...]ized for patterns [...]
particular [...] as well as isolation of [...]

SECTION IV

Findings of the Study

EARLIER IN this report, five major hypotheses
were listed. These hypotheses are restated
here. The results of the testing of each hypothesis
follow each one of them respectively.

Hypothesis A: There will be tangible modifications
in classroom practices of the teachers
involved in the study.

Evidence Relating to Hypothesis A: For the
greater part of the study there were nine teachers
plus a teaching department head in this
study. Based on reports (oral and written) of
each of these participants, the evidence of tangible
modifications is clear-cut and definite in seven
of the ten participants' classrooms. In the case
of the eighth member, changes were less than the
writer had expected while in the classrooms of the
ninth and tenth members changes were few
or none.

Some examples of modifications were the following:

1. The initiation of English-social studies core
classes in the classrooms of two of the
members.
2. A systematized, extensive penmanship project
in the classroom of one of the members.
3. Re-evaluation of the purposes of testing children
on the part of a member.
4. Utilization of personality inventories or
problem checklists.
5. Planning, developing, recording and evaluating
a "new" spelling program.
6. Examination of pupils' reading comprehension
abilities and subsequent adjustment of
the reading program.
7. Re-organization of classroom management
procedures leading to greater efficiency
and more "teaching time".

Hypothesis B: A variety of instructional methods
will be tried and tested.

Evidence Relating to Hypothesis B: In addition
to the evidence cited under Hypothesis A, [...]
that Hypothesis B was also in the main supported.

1. The department head worked with two student
teachers and developed the unit p[...]
2. [...] member tried different met[...]
3. A third [...] a method
of improving use of the dictionary.
4. Still another applied "phonics" principles
to reading and writing skills and compared
these to a method where phonics were virtually
unmentioned to the pupils.
5. Another member switched from the arbitrary
teaching of formal grammar to a
method based on greater "individual analysis"
and a way of teaching the "mo[...]
necessary skills".


Hypothesis C: There will be an increase in the 
confidence of teachers in defining problems. 


Evidence Relating to Hypothesis C: There
was a gradual and, at times, almost imperceptible
increase in confidence of teachers in defining
problems. For many weeks and months the
teachers kept examining methods, materials,
techniques, classroom load and other problems
external to them. From this they moved to defining
problems in the children: their growth,
personality, learning and cultural problems.
Then the teachers began facing problems related
to teacher-administrator relations. Some
of the participants actually got to a point where
they were examining themselves, analyzing their
own motives, values and attitudes.


Hypothesis D: Teachers will feel increasingly
free in exchanging suggestions with each
other.

Evidence Relating to Hypothesis D: One has
only to read the Minutes of Group Meetings or
the [...] made in the Progress Report and
[...] Report of the group to trace the increase
[...] and candidness among the members. M7
as well as M3 and M9 stated explicitly on
more than one occasion that knowing the other
teachers were facing the same problems as they
themselves were facing made them feel more
free to ask for help and suggestions.

In group meetings, especially during
Phases Three and Four, members did not hesitate
to say, "Why don't you try this?" or "I've
used this technique and it worked under some
conditions. Why don't you give it a try?"

Also, as time went on, the principal, supervisor
and coordinator felt more free to suggest
techniques and methods. The department head,
because of her closeness to the teacher-members,
was [...] to offer suggestions to the group
from the very beginning. The pattern of response
to suggestions changed as the members
became more "group" oriented.


Hypothesis E: The Head of the English Department
and the Principal will allow and encourage
teachers to try, test, and develop newer
methods, techniques, and courses.

Evidence Relating to Hypothesis E: Minutes
of Group Meetings show that the principal repeatedly
encouraged teachers as stated in Hypothesis
E. [...] was [...] support that the department
[...] worked core into the English curriculum.
The principal supported most suggestions
made by members of the group and helped the
members try different methods. The writer believes
that the principal's contributions were the
most beneficial ones coming from an administra- 
tor involved in the project and his positive atti- 
tude did contribute a great deal to the life and 
value of the study. 

The evidence for Hypothesis E seems conclu- 
sive to the writer as far as the principal is con- 
cerned. The evidence, on the other hand, to show 
that the department head “allowed and encour- 
aged teachers to try, test, and develop newer 
methods, techniques, and courses" is inconclu- 
sive at this time. 

There is one thing, however, which this study 
did demonstrate most pointedly. It is related to 
the role of the department head (in this study) but,
more specifically, to members’ use of research 
materials. 

Only twice during this entire study did a mem- 
ber actually utilize the research summaries (find- 
ings of experts, resource materials, or reports 
of research in progress) that were brought in to 
the group by the coordinator. In each of the two 
instances that such material was used it was by 
the same member. On both occasions the mater- 
ial related to penmanship. On the other hand, 
never during the entire forty-week period did any 
member give indication that he ever utilized the 
hectographed reprints of research findings (as
prepared by the coordinator and the group's secretary).
As far as the evidence of this study
shows, such research materials, made easily
and conveniently available to all members of the
group, did not affect or modify the teaching practices
of any group member.

Because the department head was strongly 
“for” the practice of inviting experts in to ad- 
dress the group, it seems significant to the writ- 
er that the department head as well as the other 
group members never took advantage of research 
summaries and articles in printed form. It might
well be that the opinions or findings of experts as 
presented to the group in this study were actually 
as ineffectual as the written research materials 
which the members were given. Again, this kind 
of teacher-reaction seems to bear out the two
major assumptions made at the outset of this 


study. 


Three Questions Related to the Five 
Major Hypotheses 


1. What are the strong points of the action
research method and the work-group-conference
technique?


In the Urban study the most outstanding contribution
of the action research method and employment
of the work-group-conference technique was
that classroom instruction was improved. The
improvement varied from teacher to teacher
over the thirty-eight week period but all of the
teachers stated that the project had afforded
them opportunities to improve classroom instruction.
Some examples of teachers' statements
are presented below: 


a) "My opportunity to teach core came as a
direct result of our meetings."

b) "My study was on Remedial Penmanship.
It will culminate in a new Handwriting
Scale. It has solved for me the question
'How can I improve the penmanship of my
pupils?' and has answered questions of
many years' standing."

c) "I have found that many of the ideas and
methods which I have used for the last
eight years are basically sound. The conferences
have motivated me, vexed me,
and defeated my tendency toward laziness
in educational theory. My obsession of being
a scholar has given way to one of being
an outstanding teacher."

d) "I have gleaned many techniques (of instruction)
in the past year from our discussions
that probably would have taken years
to discover by myself, if ever... I was beginning
to think that my own teaching situation
consisted of the four walls of my
classroom but the discussions caused me
to realize more forcefully that education
is a process of the whole school."

e) "...I incorporated ideas heard at meetings
in my teaching...the conferences developed
a liberal attitude within me to experiment
and find better teaching methods;
this makes a better teacher."


Besides the explicit statements of the teachers
regarding improved teaching in their classrooms,
the Progress Reports, Minutes of Group
Meetings, Consolidation of End-of-Meeting Eval[...],
and the Final Report all show that
teachers were think[...] classroom instruction.


2. Under what conditions can these methods
best be used in creating curriculum change?


In the original report of this study, it was
pointed out that certain conditions are necessary
for 1) a group to develop and operate harmoniously,
and 2) people in a group to be stimulated toward
creative participation. Good channels and
methods of communication are discussed in the
report and also the importance of key people. In
the Urban study the single most important condition
which was necessary for the success of the
project was that of "support". Assuming that
the curriculum of a given junior high school can
profit from the concerted effort of a number of
teachers working at it cooperatively, the evidence
of the Urban study indicates that it is of utmost
importance that:


a) The department head (in a departmentalized
junior high school) is fully and completely
in accord with the idea of trying to do such
a project. This kind of project will, in all
likelihood, not succeed unless this support
is constant.

b) The principal must not merely "allow" it
but be active in lending it support by giving
it the added aura of his prestige and actively
serving it by backing teachers' decisions
relative to curricular improvement. He can,
furthermore, be of greater service by participating
in group sessions when his presence
will be a positive force and his contributions
(valuable due to experience and special
knowledge) will enhance the work of
the group.

c) The supervisor, like the principal and department
head, must accept and support the
idea of action research. In order of priority
of value to the project (influence, status,
decision-making), the writer feels the ranking
is department head, principal and, lastly,
supervisor.

d) Enough time must be allowed. This means
that some teacher-release time should be
made available and also that the total length
of the project be allotted a time period appropriate
to the growth and development of
the group of participants. To "cut off" a
project before the group has completed its
whole job might mean the negation of all
that has gone before: the work, time, effort,
and even, in some cases, isolated
"islands of success" in the "improving"
curriculum as well as human relationships.
A premature forced stop might well be one
of the worst things that could happen.


3. What are the limitations of the methods and
application of action research and the work-group-conference
technique?

The Urban study strongly indicated that initial
support of the project by the principal
and department head was essential. Without
their initial support the study could not have
begun. Weak[...] and snags in the study usually
came either directly or indirectly as a
result of a decrease of support by either member.
On the other hand, when strong support
was forthcoming, the morale and production of
the group increased. The limitations of action
research and the work-group-conference technique
in this study came about through lack of adequate
support. This lack of support manifested
itself in several ways and these comprised the
limitations of the methods and their application.


a) The [...] received no financial
[...] either from the school fund, the
English Department fund, or from the system-wide
treasury.

b) [...] no release-time for
meetings of the group. Everything was
done on their own time after school.

c) The coordinator and group secretary (a volunteer
[...] job) did all recording of minutes,
[...], typing, and hectographing.
In addition, many letters, brochures,
reprinted [...] materials, and other
correspondence related to the group were
[...], hectographed, and mailed by the coordinator
and group secretary.

d) Although there is undisputable evidence that
the majority of the group was in favor of
continuing [...] for a third semester,
this opportunity was lost.


Another [...] of action research as it [...]
done in this study is that no concise research
[...] evolves [...] initiating [...] project.
The desig[...] project. This is
[...]tation but it need not be
[...]ment throughout the life-span of a project.
[...] in its method a "developmental" characte[...].
If the group and the individuals
[...] proceed in a typical fashion, designs should
become increasingly sharper and more sophisticated.
We might [...] that some teachers moved from what
[...] might characterize as the lowest points on a
"research competence" scale to relatively high
points on the scale in less than forty weeks.


The study at Urban Junior High in Detroit
was an honest attempt to test, in a field situation,
the effectiveness of the work-group-conference
method as a technique in curriculum
change and improvement. Over a forty-week
period the writer saw the teachers' values, attitudes
and skills relative to education change
enough to permit acceptance and development of a
democratic, research-oriented way of doing
things.
It would be unrealistic to say that "great" curricular
changes occurred because of the project,
but, on the other hand, it was never our expectation
that changes should or would be of great magnitude.
The terminal point of our study should
have been, we believe, the "new" point of departure
for the teacher group in further exploration
of the curriculum and themselves. The insights
and skills acquired by each teacher, however,
during the study may benefit some future
group in advancing research and cooperative
action.

FOOTNOTES

1. Examples of all the data-gathering instruments
are shown in the Appendix of the original
dissertation.

2. Within the limits of this report it is possible
to give only brief samples of how data were
interpreted. For complete examination of data
analysis see the original dissertation,
Chapters IV-VII and Appendix, pp. 351-451.

3. There is much evidence in the group's 75-page
Final Report to indicate that all members
moved definitely to the action level. An
abstract of the Final Report is in the Appendix
of the original dissertation.

4. See Chapter VII, pp. 282-95, for a complete
analysis and Appendix of dissertation.

LITERAL AND CRITICAL READING
IN SOCIAL STUDIES


E. ELONA SOCHOR 
Temple University 


The Problem and Its Scope 


THE PURPOSE of this study was to investigate
certain [...] of reading comprehension
among fifth-grade pupils. In order to explore
the problem, it was necessary to construct and
validate an intermediate-grade reading test in
social studies. The specific problems considered
were:

1. What is the relationship between verbal intelligence
and
a. "general" reading ability?
b. achievement in literal reading comprehension
in social studies?
c. achievement in critical reading comprehension
in social studies?

2. What is the relationship between "general"
reading ability and
a. ability to comprehend literally in
social studies?
b. ability to comprehend critically in
social studies?

3. What is the relationship between proficiency
in literal and critical interpretation of social
studies?

4. What is the relationship between proficiency
in [...] selected critical reading skill and
ability to comprehend literally in social
studies?


Justification of the Study

To date, the [...] of the measurement
and development of reading comprehension as
[...] among [...] levels are still widely evi[...].
Reading tests, largely limited
to apprais[...] "sense-meaning" and
[...] the field of literature,
[...] commonly to determine all read[...].
Little attention is being given
[...] reading skills in study situations. In
practice, reading tends to remain a "unitary"
ability.

[...] reading ability is no l[...]
tenable. Conclusive data from studies by [...]
(11), Tyler (32), Thorndike (30), and Davis (10)
indicate that adequa[...] functioning.
Although thi[...] [...]lished, much research sti[...]
[...]ted on the more specific aspects of reading
comprehension.

Moreover, the skills and abilities characteristic
of effective reading interpretation are not
the same in all content areas. The specificity
of skills within sub[...] areas at the sec[...]
[...] substantiated [...]. Further data
[...] these skills in each content area [...]
elementary school level.

Another m[...] problem in education
today is ver[...]. [...] emphasis has
been placed [...] of interpretation.
Retardation in readi[...] [...]mined too frequently
in terms o[...]. As early as 1921, [...]

[...] majo[...] [...]ts with the schools.
Effective reading comprehension must be
emphasized in al[...]. [...] reading comprehension is
[...]. The ability to inter[...]
[...] on current events is vital to the
[...]. In such a social or[...]
[...]ing w[...] is stated directly is a
[...]uisite. [...] literal interpretation, however,
[...] not sufficient. The citizen must be
skilled in evaluating critically the wealth of
available printed materials.


Limitations of the Study 


Experimental Design: The purpose of this investigation
was to study the relationships between
intelligence and three types of reading
ability: "general" reading, literal interpretation,
and critical interpretation. Final data
were obtained on a representative sample of five
hundred thirteen fifth-grade pupils. To obtain
these data on reading skills in social studies, it
was necessary to construct and validate the experimental
edition of a reading test in that content
area. A group test of verbal intelligence
and a standardized reading test were used to help
in estimating the normality of the distribution.

Statistical Design: The results were analyzed
with large sample techniques which included
the Chi-square test of relationship, [...]
point-biserial [...] reading test in social
studies [...] of one technique [...]
three techniques [...]
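
The Statistical Design paragraph names a point-biserial statistic, although the surrounding sentence is incomplete in this copy. As an illustration only, the sketch below computes the textbook point-biserial correlation between a pass/fail item and a continuous score. The function name `point_biserial` and the formula shown (difference of group means over the population standard deviation, times the square root of pq) are assumptions about a standard method, not a record of how the author carried out the computation.

```python
import math


def point_biserial(passed, scores):
    """Point-biserial r between a pass/fail item (0 or 1) and a continuous score.

    Hypothetical helper for illustration; not taken from the study itself.
    """
    if len(passed) != len(scores) or len(scores) < 2:
        raise ValueError("need two equal-length sequences with at least two cases")
    n = len(scores)
    group1 = [s for p, s in zip(passed, scores) if p == 1]
    group0 = [s for p, s in zip(passed, scores) if p == 0]
    if not group1 or not group0:
        raise ValueError("both the pass and the fail group must be non-empty")
    mean_all = sum(scores) / n
    # Population standard deviation of all scores (divide by n, not n - 1).
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd == 0:
        raise ValueError("scores have zero variance")
    p = len(group1) / n          # proportion passing the item
    q = 1 - p                    # proportion failing the item
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / sd * math.sqrt(p * q)
```

Computed this way, the statistic equals the ordinary product-moment correlation between the 0/1 item score and the continuous score, which is why it is convenient for relating single test items to a total.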


2. Age: The chronological age range was from
10-0 through 14-6.

3. Sex: Boys and girls were tested.

4. [...]: White and Negro children were included.

5. Intelligence: Verbal intelligence quotients
[...] 158.

[...] reading grade ranged
from minus 2.5 through 10.3.

[...] the two criterion [...]
and reading grade.

[...] Form I, Level of [...]
(published by Teachers [...]), was administered
[...]. The Pintner General
Ability Test, Form A (published by World Book
Company), was used as a measure of verbal intelligence.


Definition of Terms 

The following terminology is basic to this
study. For purposes of clarity, literal reading
comprehension and the selected critical reading
comprehension skills will be illustrated as well
as defined. In each example, the correct response
for the test item will be the first, and one
of the distractors will be the second.

"Literal Reading" represents the ability to
obtain a low-level type of interpretation by using
only the information explicitly stated. For example,
the selection states, "Millions of workers
dragged stone blocks for the outside walls and
packed basket after basket of earth between them."
The test item appraising literal interpretation of
this sentence is: "The outside walls were made
of (1) stone, (2) earth. . ."

"Critical Reading" represents the ability to
obtain a level of interpretation higher than that
needed for literal interpretation. In this study
the following critical reading comprehension
skills were set up:


1. Functional Vocabulary tests the reader's
background of experience in reference to a
concept used in the selection.

2. Semantic Variation of Vocabulary tests the
reader's ability to identify a [...]
of a given word from the selection. For example,
the word "beat" is employed in this
manner in a story: "Every day [...]
drivers beat these workers. . ." The test
item is: "The sentence which uses the word
beat just as it is used in line 26 of the story
is (1) Mother said, 'Beat the rug until it is
clean.' (2) The policeman's beat was several
miles long. . ."

3. Central Theme tests the ability to distinguish
the central topic of the selection from
subordinate ones. An example is: "This
story as a whole is about (1) the [...]
wall in the world, (2) the early emperor of
China. . ."

4. Key Idea tests the ability to identify the key,
or most important, idea in the story. The
test item is: "The most important idea in
the story is that the Great Wall was (1) [...]ing
like an army, (2) used as a highway. . ."

5. Inference tests the ability to draw a specific
conclusion indirectly from the information
given. For example, the first selection
discusses the need for the Great Wall and
then states, "It was longer than 1[...]
miles, more than half the distance across
our own country." The test item is: "The
Emperor of China needed the Great Wall
because China (1) was too large to protect
with soldiers, (2) had only a few soldiers
who rode on horseback. . ."

6. Generalization tests the ability to identify a
general conclusion or principle indirectly
from information implicitly stated. An example
is: "From the story we should believe
that ALL (1) buildings that last have
been built carefully, (2) workers of China
are better than the workers in our country. . ."

7. Problem Solving tests the ability to apply
information from the selection to a problematic
situation. One test item is: "Mrs.
Brown paid twenty-five cents for a can of
peaches. She said, 'This is how the farmer
gets rich.' She was wrong because (1)
the farmer gets only a part of the twenty-five
cents, (2) farmers get rich from dairy
products. . ."

8. Association of Ideas tests the ability to see
the relationship among ideas in a series.
For example, "The row with ideas from
the story that belong together is (1) fierce,
cruel, savage; (2) enemy, builders, horses. . ."

9. Analogy tests the ability to perceive relationship
between two pairs of ideas. The
idea which completes an established relationship
is identified: "Stones are to building
as people are to (1) nation, (2) houses. . ."

10. Antecedent tests the ability to recognize the
word or words to which a selected pronoun
refers. For example, "The word them
in line 25 of the story refers to (1) outside
walls, (2) people of China. . ."

11. Sequence tests the ability to determine a
time sequence. One test item: "Below is
a story about how certain vegetables reach
the store. The first idea out of order is
(1) The vegetables are canned, (2) the vegetables
are processed. . ."

12. Extraneous Idea tests the ability to determine
relevancy of ideas to a particular selection.
For example, "[...]
found in the story is that (1) [...]
were buried in the Great Wall, (2) [...]
used the wall for protection. . ."

13. Author Purpose tests the ability to identify
the author's primary motive in writing a
given selection. One test item is: "The
author wrote this story because he thinks
[...] should know (1) about great things in
other countries, (2) about the enemies of
China. . ."
"Survey Reading Comprehension" is a measure
of reading understanding based on the results of a
[...] which uses content largely from the
field of literature. The comprehension section
of the Gates Reading Survey was used in this
study. "General" Reading Comprehension is
used synonymously with "survey" reading comprehension.

"Verbal Intelligence" is a measure of capacity
which is obtained from a test that usually requires
a high degree of language facility both in
understanding directions and in the subject's
responses.


A Review of Kindred Literature 


Although most of the research on reading comprehension
and test construction has been conducted
at the secondary or college levels, investigations
at the elementary level tend to confirm the
conclusions indicated in the research at the higher
levels. Accordingly, the pertinent conclusions
from all the studies are summarized
in terms of two major areas: critical
reading comprehension and test construction.


Critical Reading Comprehension 


In 1917 Thorndike published three articles
emphasizing the premise that reading is a thinking
process (29, 30, 31). Since that time, educators
have been concerned not only with the "sense-meaning,"
or literal comprehension of printed
material (14, 34), but also with a more thorough
interpretation, or critical comprehension. Critical
reading comprehension has been defined as
critical thinking in reading situations (4).
Critical Thinking. — Since critical thinking is 
basic to critical reading comprehension, a sum- 
mary of the conclusions in the research on crit- 
ical thinking is pertinent to this investigation: 

1. Critical thinking necessitates the function- 
ing of higher level thought processes (3,30). 

2. Critical thinking appears to be a complex 
of component abilities, some of which seem
to have been identified (3, 12, 13, 32). 

3. The manifestation of intelligence does not 
guarantee the ability to think critically (3, 
13). 

4. The ability to think critically in one content 
area cannot be assumed to indicate that the 
same is true in another (25). 

5. Aspects of critical thinking can be meas- 
ured by paper-and-pencil tests (13,32,33). 

6. Certain aspects of critical thinking can be 
improved by instruction (13, 26, 33). 

7. Fifth-grade children can think critically.
Moreover, the difference between their
ability to reason and that of adults is merely
a quantitative one (8, 16, 21).

Critical Thinking in Reading Situations. The
research on critical reading comprehension reveals:


1. Critical reading comprehension has the same 
attributes as those stated above for critical 
thinking, but they apply when critical thinking 
is done in reading situations (4,10,11,12, 13, 
26, 30, 31). 

2. Literal reading comprehension appears to ne- 
cessitate mental functioning of a lower level 
than critical reading comprehension (3, 11,14, 
32, 34). 

3. The ability to comprehend critically cannot be 
predicted from the ability to comprehend liter- 
ally, or factually (3, 11, 12, 25, 32). 


Test Construction


The need for better test measures at the elementary
level has been stressed repeatedly in the
literature (4, 7, 19). The following list of characteristics
includes the major suggestions from pertinent
literature.

Readability—In constructing a test, the author 
should consider the two aspects of readability (4, 
9, 22): 

1. The reader: his experience, interests and
language facility.

2. The material: the interest level, the language,
the concepts, and the mechanical
features.


Reading in the Content Areas—This aspect of reading has significant implications for test construction:

1. Since reading skills vary between content
areas and success in reading the materials
of one content area cannot be used as a criterion
of success in another content area,
test materials should be built from mater-

Mechanical Features—The following criteria are suggested in the literature for test construction:

1. The test materials should be valid and reliable.
2. ... be clear and consistent for each administration of the test.
4. Each multiple-choice item should have (1) at least five alternate responses, (2) one best answer, (3) the correct answer randomized, and (4) plausible distractors of about equal length.


Summary of Procedure 


The following procedure was used in this 
study: 


1. A preliminary edition of The Intermediate Reading Test: Social Studies, designed to appraise both literal interpretation and specific critical reading comprehension skills, was constructed and validated.

2. A preliminary study was conducted in which the test was administered to one hundred and forty-three children in grades four, five, and six. The results were used to evaluate the preliminary edition in terms of readability and the discriminating power and internal consistency of each test item.

3. The measure was revised and called the ex- 
perimental edition of The Intermediate Read- 
ing Test: Social Studies. 

4. The experimental edition of the reading test 
was administered to five hundred and thirteen 
children not included in the preliminary 
study. 

a. The reliability of the experimental edition was computed by means of the Kuder-Richardson Estimate of Test Reliability.

b. The validity of each test item was evaluated by using (1) the Standard Error of the Difference Between Proportions, (2) an estimate of the product-moment coefficient of correlation based on the upper and lower 27% of the distribution, and (3) inspection of the total number of choices for each distractor.

5. Two standardized tests were administered to the population used in the major study: The Gates Reading Survey (Level of Comprehension) to appraise "general" reading ability and The Pintner General Ability Test (Verbal Series) to obtain verbal intelligence quotients.

6. The product-moment method of correlation was used to estimate the degree of relationship between the four variables: intelligence, "general" reading ability, literal comprehension in social studies, and critical interpretation in social studies.

7. Partial correlation was used to estimate the degree of relationship between the three types of reading ability ("general" reading ability, literal comprehension in social studies, and critical interpretation in social studies) when intelligence was partialled out.

8. Chi-square was used to determine the presence or absence of relationship between literal reading and each critical reading skill in social studies.

9. The point-biserial method of correlation was used to estimate the degree of relationship between literal reading and each critical reading skill in social studies.
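
The procedure names the Kuder-Richardson estimate of reliability without saying which of the Kuder-Richardson formulas was applied. The following is a minimal sketch, assuming formula 20 and right/wrong (1/0) item scoring. The function name `kr20` and the small score matrix are illustrative only and are not data from the study.

```python
from statistics import pvariance


def kr20(item_scores):
    """Kuder-Richardson formula 20 for items scored 1 (right) or 0 (wrong).

    item_scores: one list per examinee, each holding that examinee's
    0/1 scores on every item. Assumes at least two items and some
    variation in total scores.
    """
    n_people = len(item_scores)
    n_items = len(item_scores[0])

    # Variance of the examinees' total scores (population form).
    totals = [sum(row) for row in item_scores]
    total_variance = pvariance(totals)

    # Sum over items of p * q, where p is the proportion passing the item.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Hypothetical data: four examinees, three items.
example_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
example_reliability = kr20(example_scores)
```

The population variance is used for the total scores so that it is on the same footing as the item terms p(1 - p); a sample-variance version would give slightly different values on small groups.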


Summary of Results 


Problem I: The degree of relationship between verbal intelligence and "general" reading ability, literal reading comprehension, and critical reading interpretation in social studies, as estimated by the product-moment method of correlation, was .83 ± .01, .72 ± .02, and .69 ± .02 respectively.


Problem II: The degree of relationship between "general" reading ability and literal and critical reading comprehension in social studies, as estimated by the product-moment formula, was .76 ± .02 and .64 ± .03 respectively. When intelligence was partialled out, the coefficients were .42 and .17 respectively.


Problem III: The degree of relationship between literal and critical reading interpretation in social studies, as estimated by the product-moment formula, was .61 ± .03. With intelligence controlled, the correlation was .23.
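
The coefficients reported with intelligence partialled out can be checked against the zero-order correlations given under Problems I to III. The sketch below assumes the standard first-order partial correlation formula, since the source does not print the formula it used. The function name `partial_r` is illustrative.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Zero-order correlations with verbal intelligence (Problem I).
R_INTELLIGENCE_GENERAL = 0.83
R_INTELLIGENCE_LITERAL = 0.72
R_INTELLIGENCE_CRITICAL = 0.69

# "General" reading with literal reading, intelligence held constant.
general_literal = partial_r(0.76, R_INTELLIGENCE_GENERAL, R_INTELLIGENCE_LITERAL)

# "General" reading with critical reading, intelligence held constant.
general_critical = partial_r(0.64, R_INTELLIGENCE_GENERAL, R_INTELLIGENCE_CRITICAL)

# Literal reading with critical reading, intelligence held constant.
literal_critical = partial_r(0.61, R_INTELLIGENCE_LITERAL, R_INTELLIGENCE_CRITICAL)
```

Rounded to two places, the three results agree with the .42, .17, and .23 given in the summary of results.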


Proble 

m IV: 

Success ° The degree of relationship between 

total litern OH critical reading skill an the 

ir methods PE Score was comp uted by 
9 cos enty-three point-biserial coefficients 

“m Ga ranged from . 7 with 
edian” coefficient of . 


Skill w; 
Ш with the greatest degree of T 
ary’? (.28), the 


1р б: 
Skill Sith functional vocabul he 

b (С 06), the least was “extraneous idea 
nt coef- 


The esti 
шеп ee of the product-mome 
e 2:27 $us correlation ranged from ° 61 to 
` Twenty-¢ a “median” coefficient 0 .25. 
items ‘ay Wo of the critical reading test 
of thre P Pared to be signiticanton the basis 
е probability values. 


Conclusions

Within the limitations of this study as stated in the original thesis on file at Temple University, the following conclusions appear valid:


Problem I


1. Verbal intelligence appears to be very highly related to "general" reading ability (.83 ± .01).
2. Verbal intelligence appears to be substantially related to the ability to read literally in social studies (.72 ± .02).
3. Verbal intelligence appears to be substantially related to the ability to comprehend critically in social studies (.69 ± .02).


Problem II


4. "General" reading ability appears to be highly related to literal reading interpretation of social studies materials (.76 ± .02). When intelligence is partialled out, the relationship appears to be substantial (.41 ± .04).

5. "General" reading ability appears to be substantially related to critical reading interpretation in social studies (.64 ± .03). When intelligence is held constant, the relationship appears to be low (.17 ± .04).

Problem III


6. Literal reading comprehension appears to be ... related to critical reading comprehension in social studies (.61 ± .03). With intelligence held constant, the relationship appears ... negligible (.23 ± .04).


Problem IV


7. Each selected critical reading skill appears to show a negligible or low relationship to the ability to comprehend literally in social studies.
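
The Problem IV conclusion rests on point-biserial coefficients between success on a critical reading skill (pass or fail) and the total literal reading score. A minimal sketch of that coefficient follows, assuming the usual textbook form with the population standard deviation. The function name and the four-pupil example are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(passed_item, total_scores):
    """Point-biserial correlation between a 0/1 item and a continuous score.

    passed_item:  1 if the pupil answered the critical reading item
                  correctly, else 0.
    total_scores: the same pupils' total literal reading scores.
    Requires at least one pupil in each of the pass and fail groups.
    """
    pass_scores = [t for p, t in zip(passed_item, total_scores) if p == 1]
    fail_scores = [t for p, t in zip(passed_item, total_scores) if p == 0]

    p = len(pass_scores) / len(passed_item)  # proportion passing the item
    q = 1 - p                                # proportion failing the item

    return (mean(pass_scores) - mean(fail_scores)) / pstdev(total_scores) * sqrt(p * q)


# Hypothetical example: the two pupils who pass have the higher totals.
example_r = point_biserial([1, 1, 0, 0], [4, 3, 2, 1])
```

With the population standard deviation, this value equals the ordinary product-moment correlation computed on the same 0/1 item and total scores.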


General Conclusions

1. Reading comprehension in social studies appears to be a composite of many skills and abilities which apparently function at various levels of mental activity.

2. Literal and critical reading comprehension in social studies appear to be relatively independent abilities when intelligence is held constant.

3. Each critical reading comprehension skill appears to be relatively independent of the ability to comprehend literally in social studies.

4. When intelligence is held constant, critical reading comprehension in social studies appears to be virtually independent of "general" reading ability; literal reading comprehension, relatively independent of "general" reading ability.

5. Group tests of "general" reading ability and group tests of verbal intelligence tend to measure common factors.


Implications 


Several school practices need to be consid- 
ered thoughtfully if reading instruction in Social 
Studies is to be improved: 

1. The use of a “general” reading test to 

identify all reading needs. 

2. The practice of teaching reading as a ‘‘un- 
itary” ability in materials taken from the 
field of literature. 

3. The use of a group, verbal intelligence
test to estimate the intelligence of all pupils.


A reading test appraising "general" reading ability does not identify all reading needs. By definition, it is "survey" in nature and lacking in specificity. Frequently it is limited to a low-level type of interpretation. Furthermore, the usual reading test is composed primarily of materials from the field of literature.

Such a test is inadequate, in the first place, because reading comprehension cannot be confined to the interpretation of the sense-meaning in literature materials. Reading is a complex process embracing many levels of interpretation and many different skills and abilities.

In the second place, the reading skills and abilities necessary to adequate interpretation vary considerably within and between the various subject-matter fields.


critical reading comprehension, should be ident- 


Scores in this study indicates that th 
test ed lacked the ability t 
studies materials Critical 


To appraise a retarded reader's mental capacity by means of a group, verbal intelligence test is a highly questionable procedure. The amount of relationship between the group, verbal intelligence test and the group reading test in this study as well as in other studies implies that one can be predicted from another with considerable accuracy. Therefore, a child unable to read cannot perform at or near his mental capacity level on such an intelligence test. When reading retardation is apparent, it is advisable to use an individual measure of mental capacity.

Another major need in education is the construction of reliable and valid measures to appraise critical reading skills in all subject matter areas at the elementary school level. Critical reading skills are not included in content area tests available now. Such a test should yield a score on each critical reading comprehension skill so that specific needs can be identified.


Suggestions for Further Research 


Further inquiry into reading comprehension 
appears to be warranted. The following prob- 
lems are in need of investigation: 

1. The relationships between, and interrela- 
tionships among the critical reading com- 
prehension skills in social studies. 

2. The investigation of other skills necessitating
a higher level of interpretation in social
studies.

3. A factorial analysis of critical reading com- 
prehension in social studies. 

4. The investigation of literal and critical reading comprehension within other content areas and between content areas.

5. Investigations on the development of each critical reading comprehension skill.

6. Studies to evaluate effective methods for developing critical thinking in reading.

7. The construction of valid measures to appraise literal and critical reading comprehension skills in all content areas at the elementary school level.
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Io determine th purpose of this investigation was 
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Humber (21), Shores (28), and 
duced experimental evi- 
school level which re- 
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se dictate the skills to 
rticular selection, 
tent area does not 
ensure her, and (c) ability 
to interpret cont does not guarantee 
commensurate ability in à higher level of inter- 
pretation. 
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data at the elementary schoo. 
This study is one attempt to pr 
concerning certain skills used in геа 
ence materials at that level, 


Limitations of the Study 


Experimental Design— This study was primar- 
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ibi pling 
urban Philadelphia and four urban Philadelphia schools. Eighteen different classes were represented in the final population, nine urban and nine suburban.

2. Age: The chronological age range was 
from 10-0 years to 14-6 years, inclusive. 

3. Sex: Both sexes were represented in the study.

4. Intelligence: The verbal intelligence quotient range for the population was from 57 to 158, inclusive.

5. Reading Grade: The standardized reading grade ranged from less than 2.5 through 10.3.

6. Final Population Criteria: Only subjects who completed all tests were included in the final population. Of the six hundred eleven children tested, five hundred thirteen had complete results. Two hundred sixty-nine of the final population fell within plus or minus one standard deviation of the mean on both the intelligence and the standardized reading tests.


Tests Administered—To obtain a measure of reading comprehension on an untimed power test, The Gates Reading Survey, Grades 3 to 10, Form I, Level of Comprehension (published by the Bureau of Publications, Teachers College, Columbia University), was administered. To appraise literal and critical reading interpretation of science materials, the experimental edition of the Intermediate Reading Test: Science was used.


» Form A (published by World Book Com- 


Terminology 


1. Literal reading is the ability to obtain a
low-level type of interpretation by using only the
information explicitly stated.


2. Critical reading is the ability to obtain a ... higher than that ...


a. Functional Vocabulary tests the 


from subordinate ones, 
d. Key Idea tests the ability to identify the key, or most important, idea in the story.

e. Inference tests the ability to draw a specific conclusion from facts explicitly stated.

f. Generalization tests the ability to identify a general conclusion or principle from information implicitly stated.

g. Problem Solving tests the ability to apply information from the selection to a problematic situation.

h. Association of Ideas tests the ability to see the relationship among ideas in a series.

i. Analogy tests the ability to perceive relationship between two pairs of ideas.

j. Antecedent tests the ability to recognize the word or words to which a selected pronoun refers.

k. Sequence tests the ability to determine time sequence.

l. Extraneous Idea tests the ability to determine relevancy of ideas to a particular selection.

m. Following Directions tests the reader's ability to evaluate information as a preliminary step to executing or rejecting printed instructions.

n. Visualization tests the reader's ability to interpret a graphic representation of an idea verbally presented in the selection.




3. "Survey" Reading Comprehension is a measure of understanding based on the results of a reading test which uses content largely from the field of literature. For this study, the comprehension section of the Gates Survey was used.

4. "General" Reading Comprehension is used synonymously with "Survey" Reading Comprehension in this study.


A Review of Kindred Literature


Few studies have been reported that are directly relevant to a study of the reading comprehension of fifth-grade children. Nevertheless, the conclusions from many other studies as well as the opinions of recognized authorities have contributed to the assumptions upon which this study was based. Accordingly, they are included in this survey of kindred literature which is here summarized in terms of the major conclusions reported.
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— ang Process 
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ing is that type of thous” 
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critical reading". 


Development and Appraisal of Critical Thinking


A review of the research has tended to confirm the following assumptions:

1. Critical thinking has a number of components (14, ...).
2. The ... mechanism essential to ... develops gradually and is usually present in the individual by the age of seven (6, 8, 19).
3. Children observe the same general patterns of thinking as adults but are limited in reaching an equal degree of ability by their lack of experience (18).
4. Growth in certain components assumed to be involved in critical thinking can be affected by instruction (12, 14, 25).
5. Critical thinking abilities and those measured by intelligence tests are not identical (14, ...).
6. Certain ... abilities can be ... paper-and-pencil tests (13, 14, 25).
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Skills nalysis of Reading Comprehension 
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з. Poor joni and difficulties exhibited by 
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ere det Specific to each content area 
ermined (15, 23). 
1. Reading is a composite of ... and skills, ... identified, and manipulated in specific reading situations (10, ...).
2. Reading is essentially a thinking process ... effectiveness all the elements needed in critical thinking (10, 14, ...).
3. The ability to read a passage with literal ... does not guarantee the ability to interpret that selection critically (2, 10, 11, 13, 39).
4. The ability to read successfully in one content area does not ensure equal success in another (21, 23, 28).
5. Certain reading and thinking skills are responsive to training (14, 27).
6. In order to obtain a valid measure of reading ability in a given content area, selections from that field must be used (17, 30).
7. The concept of "general reading ability" is not supported by scientific evidence (10, 28).


Test Construction 


A review of the literature on test construction led to the following conclusions:

1. Most standardized reading tests are designed to measure the literal rather than the critical interpretation of the printed material (5, 10, 13).

2. To date, in the research on test construc- 
tion, reading and thinking have been treat- 
ed as separate entities (5, 10). 

3. No standardized reading tests are now
available at the elementary school level
which measure children's ability to think
critically about printed materials (24).

4. Test content should resemble as closely as ... features exhibited by ... materials of the content area being ...

5. The struct 
11 controlled (7). 


ence the ге 
include those 
lary, (b) aver 

complexity of sentence structure (9, 16). 


Summary of Procedure


A summary of the procedure followed in this study is outlined below:

1. Constructed the preliminary edition of The Intermediate Reading Test: Science in order to obtain a measurement of literal and critical reading achievement in science.

2. Appraised the reliability of this edition by 
administering it to 143 children of grades 
four, five, and six in a preliminary study. 
Used item analysis on the test results. 

3. Revised the test to serve the major purposes
of the study. Thereafter referred to the revised
edition as the experimental edition of
The Intermediate Reading Test: Science.

4. Administered the experimental edition to a 
different population consisting of 513 fifth- 
grade children. 

5. Administered to the same population two 
standardized tests: 

a. Gates Reading Survey (Level of Comprehension) to determine the "general" reading ability.

b. The Pintner General Ability Test (Verbal Series) to obtain an index of verbal intelligence.

6. Estimated the reliability of the experimental edition of The Intermediate Reading Test: Science by using the Kuder-Richardson formula with the data obtained on the total population.

7. Studied the reliability of each literal and critical reading test item by:

a. Inspecting the responses of the total population.

b. Computing the Standard Errors of the Difference Between Proportions with the scores of the "good" and the "poor" readers (the upper and lower 27% of the population).

c. Estimating the Pearson Product-Moment correlations between achievement in literal reading and in each critical reading skill, using the upper and lower 27% of the distribution.

8. Computed the Pearson ... achievement, and verbal intelligence quotients.

9. ... by employing the Chi-Square test of significance.

10. Determined the Point-Biserial Correlation ... total scores of the ... of the experimental edition and achievement ...

Summary of Results


Problem I: Literal and Critical Reading Comprehension


1. The Pearson Product-Moment formula yielded a correlation of .67 ± .02 between literal and critical reading comprehension in science. With intelligence held constant, the correlation was .34.


Problem II: "Intelligence" and Reading Comprehension


2. The Pearson Product-Moment formula yielded a correlation of .83 ± .01 between the Pintner Intelligence Test (Verbal) and the Gates Reading Survey (Level of Comprehension).

3. The Pearson Product-Moment formula yielded a correlation of .75 ± .02 between the Pintner Intelligence Test (Verbal) and the literal reading section of the Intermediate Reading Test: Science.

4. The Pearson Product-Moment formula yielded a correlation of .67 ± .02 between the Pintner Intelligence Test (Verbal) and the critical reading section of the Intermediate Reading Test: Science.


Problem III: "General" Reading Comprehension and Literal and Critical Reading Comprehension in Science


5. The Pearson Product-Moment formula yielded a correlation of .75 ± .02 between the Gates Reading Survey (Level of Comprehension) and the literal reading section of the Intermediate Reading Test: Science. With intelligence held constant, the correlation was .35.

6. The Pearson Product-Moment formula yielded a correlation of .60 ± .02 between the Gates Reading Survey (Level of Comprehension) and the critical reading section of the Intermediate Reading Test: Science. With intelligence held constant, the correlation was .11.


Problem IV: Literal Reading Achievement and Achievement in Each Critical Reading Skill (Science)


7. The Point-Biserial formula yielded correlations ranging from -.15 to +.47 between achievement on the total literal reading section and performance on each critical reading test item. Table XIII presents the results.
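
The source does not say how the ± terms attached to the product-moment coefficients were obtained. One reading that reproduces the Problem II figures is the large-sample standard error of a correlation, (1 - r²)/√N, with N = 513 children. This is an assumption made for illustration, and the function name is hypothetical.

```python
from math import sqrt

N_CHILDREN = 513  # size of the final population reported in the study


def standard_error_r(r, n):
    """Large-sample standard error of a product-moment correlation.

    Assumption: the classical approximation (1 - r**2) / sqrt(n).
    """
    return (1 - r ** 2) / sqrt(n)


# Problem II coefficients: intelligence with "general", literal, and
# critical reading.
se_general = standard_error_r(0.83, N_CHILDREN)
se_literal = standard_error_r(0.75, N_CHILDREN)
se_critical = standard_error_r(0.67, N_CHILDREN)
```

Rounded to two places, these come to .01, .02, and .02, matching the terms printed beside the three Problem II coefficients.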


Conclusions

Within the limitations stated, the following conclusions on each problem seem to be valid:


Problem I:


1. There is a substantial relationship between literal and critical reading comprehension in science.


Problem II: 


2. There is a very high relationship between 
verbal intelligence and “general” reading 
ability. 

3. There is a high relationship between verb- 
al intelligence and proficiency in literal 
reading in science. 

4. There is a substantial relationship between
verbal intelligence and proficiency in critical
reading in science.


Problem III: 


5. There is a high relationship between "general" reading comprehension and literal reading comprehension of science materials.

6. There is a substantial relationship between "general" reading comprehension and critical reading comprehension of science materials.


Problem IV: 


7. There is a very low or negligible relationship
in science between proficiency in literal
reading and in each of the respective
critical reading skills.


The conclusions for this study may be summarized as follows:

1. Critical reading comprehension in science is a complex of skills or abilities, each of which is relatively independent of the ability to read literally.

2. Proficiency in critical reading of science materials cannot be predicted from scores obtained (a) on literal reading tests in science, (b) on group tests of verbal intelligence, or (c) on "general" reading tests.

3. Proficiency in literal reading interpretation of science materials may be predicted with a fair degree of accuracy from scores on group tests of verbal intelligence and "general" reading tests.

4. Group tests of verbal intelligence and "general" reading tests tend to measure many common abilities.

Implications


These four general conclusions seem to justify the following implications for ... education:


In planning an analysis program, both curriculum workers and the personnel responsible for the testing program need to give serious thought to the limitations of a group test of verbal intelligence for measuring the capacity of retarded readers. Consideration also needs to be given to the inadequacy of either the standardized reading tests available at the elementary school level or the group tests of verbal intelligence for predicting proficiency in critical reading comprehension in science. The school personnel should realize that since critical reading comprehension embraces relatively independent abilities, a valid diagnosis of that complex can be made only by measuring proficiency in each specific critical reading skill and by using materials of specialized content. Accordingly, a complete elementary school analysis program should include, in addition to other tests, (1) an instrument for measuring the capacity of retarded readers, and (2) tests of critical reading comprehension that would yield a measure of proficiency in each specific critical reading skill in each given content area.

There is a crucial need for a new type of instrument to measure reading comprehension at the elementary school level. The proposed instrument necessarily would include items designed to measure the relatively independent abilities inherent in critical reading comprehension of a particular content. The obtained scores could not be expressed as a composite score but probably could be presented in profile form. This would show the relative strength or weakness in each specific critical reading skill and, therefore, could serve as a guide in the preparation of the instructional program in the various content areas.

Classroom teachers should recognize that since critical reading ability consists of relatively separate abilities, the best procedure for developing critical reading proficiency is by providing instruction in each specific skill. For optimum results, this instruction needs to be systematic and direct, e.g., in order to develop problem-solving skill in science, opportunities for solving problematic situations by using science content should be afforded. By improving ability in each specific critical reading skill, the general level of critical reading ability could be raised.
Another implication from the study may be stated as a caution to classroom teachers. From the results obtained in this study it appears obvious that this representative population of elementary school children tended to be ... achievers in critical reading comprehension of science. This was equally true for children of superior intelligence as for those of low intelligence. It appears vital to urge the classroom teacher not to take the development of critical reading comprehension for granted as a concomitant of normal or superior intelligence, but to realize that it is a skill that needs development. Accordingly, the teacher should provide, for all children, systematic instruction in each of the critical reading skills needed for successful interpretation of each specialized content.


MANEY


Suggestions for Further Research


A more exhaustive analysis of critical reading comprehension seems to be in order. Among the problems that need investigation are the following:

1. Construction of instruments for diagnosing critical reading comprehension in the various content areas.
2. Study of the relationship between literal and critical reading comprehension (a) at other grade levels, and (b) in other content areas.
3. Investigation of the effect of systematic, directed ... in critical reading skills (a) at each elementary grade level, and (b) in each content area.

еа of critical reading 
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Ersta 
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nd the adult world. A number of 


hil ееп re 
dea, ported children's perceptio 


Selye ior d 
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еге, О test t 
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со Or 
tie 
Suh, e inc 

misa Casi 
Ssi ngly aware of difficulties o 


hat 
ont Chi 
tle gry 
ig 


RECE 

n nt of su a the personal and social 
‘Tom the poi ren has been studied princi- 
int of view of adults. It has be- 


einer 
сер ASL н 
ері ngly evident that not опу аге adult 


lons 
of chi 
я hild i 
ild behavior important to our 


ing, b 

‚ bu 

9 dub s also is necessary some 
s perceptions of themselves, 


ques hav 
tions, od developed to assess these 
play and о from theclinical interview 
ructured tos her projective methods 10 
studi range of rl For example, children 
erpa 6 Variousl ur to fourteen years have 
aot (6), M y by Del Solar (2), Grilfiths 
?rvation ott (9), and Rogers (10), by di- 
Bt d AEN. questionnaire an 
iques. In these studies there 
ns 0 


re ande 
; control (6); b) household, soci 


mic relati 
& зав ре and activities (б 
cn ties of children as they them” 


shin. © them 
wes (10) (5); and d) self and other rela- 


e I 
et. e of these investigations» 
914, i School children ranging in 
› in a research progra = 


ed , 
h à 
ree major hypotheses. These 


t 
youn; 
O0; Ë chi 
š ildren of the early el ementary 


ulti evel 

e a А 

icha re aware mainly of those diffi- 
aggres 


rai H 
acterized by overt and 
at with increasing age the 


opc: 
ata T withdrawing nature. 
astia 
С Chi 
‘ties ar tenis judgments of b 
in greater agreement. 


re 
n grow older, parents”, teach- 
ehavior 


ial-econ- 
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ehav- 


dren 
Ero from the middle s OC 


. Louis, 


TIONS OF RELATION- 
THEIR FAMILY 
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g nature, than children 


missive or withdrawin 
cio-economic 


from the upper and lower 50 
groups (9). 

Data to test these hypotheses were gathered from parents and teachers by questionnaire, and from children by interview. The behaviors analyzed were categorized as aggressive, delinquent-related, withdrawing, or non-compliant. The latter applied only to the home situation, therefore teachers' replies were not scored for this category. In general, the findings supported the hypotheses.
a simi 


h smaller 
e from 6 to 12), 
ilaren percei i 
s improvement 
om behav- 
e children reported 
alauthority, 2) their 
and3) the need for im- 
ta have sug- 
's perceptions 
to behavior. 


tion to parents, 


f classroo 


children also re- 
hoices (cf. 
tive class- 
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were available for study reports from 1) 
eps records, 2) the mental health service 
workers in contact with the mothers and with 
school personnel, and 3) in a few cases (N = 6), 
a Child Guidance Clinic. Also, at approximately the time of the observations in the classrooms the mothers were interviewed in their homes by trained interviewers during a two-hour period.

Thus, data from 1) direct observation of the children in the social structure of the school classroom, 2) their mothers in the home situation, 3) the teachers, and 4) the mental health workers were gathered independently. The latter two independent sources of data were used as criteria against which to evaluate the children's behavior. These raters were instructed in the definitions of general adjustment level developed from Ullmann's earlier study (13).


1. A child who is unusually well-adjusted in his relationships with others and in his accomplishments.

2. A happy child who gets along well and accomplishes reasonably well the things that usually go with his age and level of development.

3. A child who is not so happy as he might be; has moderate difficulties in getting on; growing up presents something of a struggle.

4. A child who now has, or is likely sooner or later to have, ... problems.


The sample under Study consisted of third- 
grade children whose 


*» — Were printed 
board and Copied 


tion of several o 


S listed According to 
; for all Subjects, 
d form, two Space 
assigned, one for “best boy friend in the room", 
and the second for « n this 
room’’. Following t i i 
and friends’’, each 


report. | 

The second step in this method of sampling childhood behavior
consisted of each child rank-ordering his preferences for all
family members and for his two best friends, rating the most
preferred individuals as "1", the next as "2", etc. Again his
reporting was individually checked by one of the several
observers. There were occasional instances in which a girl or
boy was reluctant to indicate "best friend" of the opposite sex
but this was overcome by quietly encouraging the child to
complete the "game". When completed, the data obtained by this
method provided the ratings and preferences of family and
friends for a sample of 91 third-grade youngsters from three
classrooms, one from each of three adjacent school districts.
Although, in their ratings, some of the 91 children had
incorrectly followed instructions, ten sets of ratings were
corrected without difficulty and only three were unusable in
the data analysis.

The data permitted the testing of several hypotheses which had
been developed (3, 4) in the larger project. In an earlier
study, Rogers (10) had reported some of these hypotheses and
tested them in the development of his "Test of Personality
Adjustment".1 The first stated that the disturbance in the child
as assessed by teacher and trained mental health worker was
related to his perceptions of the family constellation. Thus, it
was hypothesized that a child who described one parent as "1"
and the other as "3" or lower would show greater disturbance
than one who rated the parents in "1-2" order. Similarly,
children who preferred friends to sibs, and, also, to parents,
and who rated those sibs just next to them in birth order as
least preferred (sibling rivalry) among the sibs and friends
would be rated as disturbed.

Along another dimension of the research study, the hypothesis
was formulated that size of family would be related to the
degree of disturbance in the child. The analyses were designed
also to test the hypothesis that sex of the child would
differentially contribute to the family and friend preferences.
Finally, as a result of earlier work (1), it was hypothesized
that the presence of grandparents in the home would be related
to disturbance in the child. There were one or both grandparents
in ten of the 88 family homes, perhaps constituting a subculture
with psychological characteristics differentiating these homes
from the other 78 homes.

A scoring system was devised to include the variations of
preferences within the family, based upon a total score of ten
from which a point was subtracted for each of the following rank
orders:

a. Father and/or mother rated other than 1 or 2;
b. Friends preferred to grandparent(s);
c. Friends preferred to parents and/or sibs;
d. Other adults preferred to grandparents, parents and/or sibs;
e. Sib next to subject least preferred.
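
The five deductions listed above can be sketched in a few lines of Python. This is only an illustrative reading of the rules, not the authors' scoring procedure: the function and argument names are hypothetical, "preferred" is taken to mean a numerically lower rank, and rule (e) is read, following the sibling-rivalry hypothesis stated earlier, as the adjacent sib holding the worst rank among sibs and friends.

```python
def preferred(group_a, group_b):
    """True if any rank in group_a is better (lower) than any rank in group_b."""
    return any(a < b for a in group_a for b in group_b)


def family_preference_score(parents, sibs, grandparents, other_adults,
                            friends, adjacent_sib=None):
    """Start from ten and subtract one point per rule (a) to (e).

    Every argument is a list of rank numbers (1 = most preferred);
    adjacent_sib is the rank given to the sib next to the subject in
    birth order, or None if there is no such sib. All names are
    assumptions made for this sketch.
    """
    score = 10
    family = parents + sibs
    # (a) father and/or mother rated other than 1 or 2
    if any(rank not in (1, 2) for rank in parents):
        score -= 1
    # (b) friends preferred to grandparent(s)
    if preferred(friends, grandparents):
        score -= 1
    # (c) friends preferred to parents and/or sibs
    if preferred(friends, family):
        score -= 1
    # (d) other adults preferred to grandparents, parents and/or sibs
    if preferred(other_adults, grandparents + family):
        score -= 1
    # (e) sib next to subject least preferred (assumed: among sibs and friends)
    rivals = sibs + friends
    if adjacent_sib is not None and rivals and adjacent_sib == max(rivals):
        score -= 1
    return score
```

A child who rates the parents 1 and 2, the sibs next, and the two friends last keeps the full score of ten under this reading.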


The empirical scoring system just described may be compared with
a similar one developed by Rogers. In both systems, children's
ratings were scored. Rogers' scores constituted one of four
subtests which contributed to a "family maladjustment score" in
evaluating responses to his Test of Personality Adjustment. His
scoring was on the following basis:

If there are two or more sibs and one of the sibs next to the
subject is given the highest number in the family, 1 point.

If one of the friends is given a lower number than a member of
the family, 2 points.

If parents are separated by two ratings (e.g., mother rated "1",
father rated "3"), 2 points.

If parents are separated by more than two ratings, 4 points.

If parents receive highest number, 2 points.


The scores obtained from the third-graders' preferences among
family members and among their respective classroom friends were
treated by analyses of variances associated with the variables
of school, sex and criterion group (ratings of general
adjustment); and by chi square, where frequencies constituted
the data.


Spo @ seri 

ses , 168 0 

"lena" tested densa of the childr en's re- 

Over amis preferences for one or both 

ily members, one or more sibs 

li r pa rents, 

чеч д bject least 

Pas аре еге 
the. did 19. 


ang r 
the’ АШ I 
a red, Sib id “oe and friends, 
Valeri hus e subject was по 
SS SA Pothesis abo pattern, following ап 
le, S. ited ut order of pre 3 
Two a more than a ate, sub- 
o q patterns appeared— 
xh ав le and sibling next to the subject 
° derited fis fs preferred. Twelve © hil- 
Т. In se former pattern and 15the sib- 
Ыла, other instances the chil- 
op Ver = as more preferred anone 
Sng the E indicated of the remaining seven pat- 
“thy 88 children xata d 1-4 children 
k. | squa Patterns. rated their prefer 
Chi square analyses of the various frequency distributions by
criterion group, using a correction for discontinuity (12),
yielded no significant probabilities for the chi square values.
These values ranged from .00 to 2.60, none approximating the
values required for a significant probability (P = .05 and .01
require 3.84 and 6.64, respectively). These findings indicate
that there were no significant relationships, for either boys or
girls, between preference patterns and general adjustment as
rated by teachers and trained mental health workers. This was
true also for the pattern of "normality", which did not occur
significantly more often among boys or girls, or among those
rated as adjusted or poorly adjusted.
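
As a hedged illustration of the test reported above, the following Python sketch computes a chi square with a correction for continuity on a 2 x 2 frequency table and compares it with the two critical values quoted in the text (3.84 and 6.64, one degree of freedom). The function names and the cell counts in the usage note are hypothetical and are not data from the study.

```python
def corrected_chi_square(a, b, c, d):
    """Chi square for the 2 x 2 table [[a, b], [c, d]] with Yates' correction."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # an empty row or column: no association can be tested
    # |ad - bc| is reduced by n/2 (never below zero) before squaring
    adjusted = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * adjusted ** 2 / denominator


def significance_level(chi_square):
    """Classify against the one-degree-of-freedom values cited in the text."""
    if chi_square >= 6.64:
        return ".01"
    if chi_square >= 3.84:
        return ".05"
    return "NS"
```

For example, `significance_level(corrected_chi_square(12, 5, 4, 10))` classifies an invented table whose corrected chi square is about 3.88; every value reported in the text (.00 to 2.60) falls in the "NS" branch.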
The size of the family ranged from one child and his parents to
nine children and their parents, with the mean number of
children 2.6 per family. The households were distributed as
follows:

Family Constellation                     N
Grandfather with family                  2
Grandmother with family                  5
Both grandparents with family            3
Other adults with family                 3
Only one child in family                 9
Parents and two or more children        66

Analyses of the variances associated with size of family, sex of
the third-graders, and criterion group (general adjustment
level) also did not yield significant effects. The F ratios
varied from .8 to 2.26, values well below those required even at
the .05 level of significance.

In treating 
in the 


sence of grand- 
efer- 


s low, as reported 
t twelve percent of the children 
households wit grandparents as mem- 
i es just reported, there 
s between pre fer- 


significant i 
s of children with grandpar- 
without 


above; 


ested here had been 


experience and constituted 
he design. Except for 
s did not 


i rom the 91 third 
tudy. Not unco m- 


The series of hypotheses t 


led studies. This 
py Kelly and 


e 
Meehl (8), and Rogers et al. (11). In these assessment and
prediction studies, follow-ups of the clinical interview, case
history and test data did not successfully predict behavior in
spite of the clinical traditions surrounding their use. It
should be emphasized, however, that the investigators were aware
of the yet-unsolved criterion problems.
Another interpretation of the present findings suggests that
these results may be a function of the developmental stage at
which the data were obtained: the preadolescent age range of
8-10, in which the family still dominates the personal and
social worlds of the third-graders, and extra-family group
behavior, even in the social unit of the classroom, does not yet
have greater influence. Also, the ordinal position of the child
in the family hierarchy and the special characteristics of the
homes with grandparents as part of the family units may be
determinants in children's preferences. Finally,


pear to a significant degree, Suggesting the po- 
tency of this cultural characteristic, 


H. Sociometric Procedure 


m observations described 
r, the teacher was Out of the room for the 
entire morning period, The cl 


Social environmental unit and socio- 


y of peer re lationships, 
Each child was 


€ Observers dis- 
couraged the few children who looked at earlier 
judgments, reminding them of the instructions 
given Previously not to look back 


king at 
the moment). g 

The six sets of judgments asked of the chil- 
dren were “decide which three bo 


this room you most like", « 


Ot like to play with’’, 
urself don’t want to 
do things 


latter two Judgments 


T, aS evolved in the 
research program (3,4) from Conferences with 


the mental health service workers. The instructions given the
children follow:


Please look at the paper in front of you. You see that the name
of every boy and girl in this room is on the sheet, and the
number is next to the name. Now decide which three boys or girls
in this room you most like. (Pause) Put a "1" by each of their
numbers (demonstrated on blackboard). (Pause) Be sure there are
three 1's on your sheet, and that these 1's are next to the
numbers of the boys or girls you most like in this room. Now put
a second "1" (demonstrated) by the number of the boy or girl you
like most of all in this room. (Observers check responses to see
that each child has three 1's, one of which is double.) Please
put the sheet under the others on your desk.


Results 


A number of statistical problems arose in treating the data,
from the disproportional distribution of girls and boys,
numbering 44 and 47, respectively, over the four criterion
groups (N's of 22, 40, 19 and 10), the varying classroom sizes
(N's of 35, 33 and 23), and the varying number of choices among
the children. In the first problem, a correction for
disproportion (12) statistically controlled the differences in
N's over the criterion groups; in the latter problem, the
restriction of choices was treated as a variable of significance
in the sociometric design. It was hypothesized that a child who
restricts his choices to a single classmate differs
psychologically from another who nominates three different peers
in his choices.

In order to handle the second problem, a scoring system was
devised to account for the varying numbers of choices. This
latter was taken from the following nomograph:


                       Number of Positive Choices
                       0-1    2-3    4-5    6-21
Number of      0-1      4      5      6      7
negative       2-3      3      4      5      6
choices        4-5      2      3      4      5
               6-14     1      2      3      4
It can be seen that a score of 7 was assigned individuals chosen
positively six or more times, and with no, or no more than one,
negative choice, and a score of 1 was given a child who was
chosen negatively six or more times and who had received no, or
no more than one, positive choice.
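
The nomograph can be expressed as a small lookup table. The sketch below is offered only as an illustration; the names `NOMOGRAPH`, `band` and `sociometric_score` are hypothetical, and counts of six or more are simply assigned to the top band (the printed bands end at 21 positive and 14 negative choices).

```python
# Rows: bands of negative choices; columns: bands of positive choices.
# Band 0 = 0-1 choices, band 1 = 2-3, band 2 = 4-5, band 3 = 6 or more.
NOMOGRAPH = (
    (4, 5, 6, 7),  # 0-1 negative choices
    (3, 4, 5, 6),  # 2-3 negative choices
    (2, 3, 4, 5),  # 4-5 negative choices
    (1, 2, 3, 4),  # 6 or more negative choices
)


def band(count):
    """Map a raw number of choices received onto one of the four bands."""
    if count < 0:
        raise ValueError("a count of choices cannot be negative")
    return min(count // 2, 3)


def sociometric_score(positive, negative):
    """Coded score (1 to 7) for a child chosen `positive` and `negative` times."""
    return NOMOGRAPH[band(negative)][band(positive)]
```

Under this reading, `sociometric_score(6, 0)` gives the maximum of 7 and `sociometric_score(0, 6)` the minimum of 1, matching the two cases described in the text.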


The coded scores distributed themselves as follows:

re V 

à : s: 4 5 6 7 

9 11 30 16 10 7 


Analyses 
€ and ue distributions of scores by sex 
Sults indi iterion group were m d š $ 
amon icating signifi Г аде, with the 
med the latter. b г icantdifferences (P = .01) 
grou les. The m ut not for the sex and school 
an PS ranged ean scores of the criterion 
isa 3.2 ede 5.0 to 3.2—5.0, 4.2, 3.2 
u -0, 4.4, 9 
inga SUE and uode (Best ete according 
help) 9 4 (in need m health service worker rat- 
Varieg respectivel or receiving P sy chiatric 
from 4.5 Кз Mean scores of {һе boys 
1, but the em and for the girls from 
Signifi mean, 3,8; liiferentes between the sexes 
Соо] ал. t e н 4.3) were not 
tion 2,09), nor did y aiguiteanily (& 03, 4.06 
relati 09» termed ee of the various restric- 
categoric Ships” PES а of interperson- 
childr ies of child e latter consistedof three 
Of the с) Who sele ren—in the first were those 
thos ајот a Cr: a different peer for each 
three о ован те; in the second were 
in the үр ОЇСев (eith he same child for two of the 
Peer fo ird were ss positive or negative), and 
текче, Sach of th ose who nominated the same 
e = his Ийе three choices, i.e., a child 
gr dc in his rar relations toa single 
в, these © In his ne itive choices andto a second 
ереену гее кы m choices. The means 
pores ely. Wieden mene 4.0, 4.1 and 4.1, ге- 
sts "es not i s ir mean sociom etric 
Strigg; . SChool y significantly, interaction ef- 
ing lon^» and IPR (th ea e 
ex. SOC manifest e interpersonal re 
ating Ometrj ed by the children in select- 
Scho 8 that ic choices) w -cnificant, inii- 
81901 vari е associati ere significant, ind? 
ent 29 Tied from ion between IPR and 
Dee, pond 2 has to school. i 
byp Sen se each = dren who selected 
(s in sep tively a choice were 
12% conp Ol 3 the ore often than in School 3; 
spad 2 ntrasted mean sociometric was highest 
leteq respectively 2. 25 and 3. 80 for Schools 
We The teer eet for those children who re- 
co thers’ ratin 
sI pupun Sistent one of general 
op Tes г? With th h the sociometric 
by > a Tom 4 e latter’s choices r 
tp, 3nd 3 O ` 2 to 5.1 f d 
ea -0-3.3 Tor or the student 
Qin. atin Chers. ү r the students rate 
sd ince and th t should be remembered that 
S ега ependentl sociometrie scores Were ob- 
a ir ‚ 8nd pupil y, although, of course, 
As | they “lament S may have based some or all 
icate tha thesame behaviors. These 
s t crit at, with teachers’ r 
earl eria, students’ 506 iometr 
y as the third grade of school 5187 


adjustment 
choices of 
anging in 
s rated 1 


ае differentiate the levels of adjustment of 
m р! s: I this discrimination occurs on the 
ЫНЫР др choices related to “like”, “pla; 
ks 2 lemandingness, without reference i 
e adjustment dimension along which te 
rated the children. к 
In another analysis, by the chi square method, the frequency
with which boys chose other boys or chose girls, and vice versa,
was examined. For example, did the boys and girls tend to pick
the same or opposite sex classmates for their positive and
negative choices? The chi square probability values (P) varied
by school and sociometric variable, as shown below. (NS
indicates that the obtained chi square did not reach
significance at the .05 level.)


Sho 
x EM mL 


Sociometric Area 
SS 
NS NS .05 NS 


Like most 
Like most to 
play with o 01 05 .01 
Least demanding NS NS NS NS 
Like least .05 NS .05 .01 
Like least to 
play with .01 NS .01 .05 
NS NS NS NS 


Most demanding 
These data predominantly show significant sex discriminations
only in the play area, i.e., boys and girls chose their own sex
significantly more often than they chose the opposite sex. In
contrast, on the "demandingness" dimension there were no
significant discriminations, both boys and girls choosing their
same or the opposite sex in roughly similar proportions.


¿qiking” area; only three of the six C 
had significant P- ith the children of 
School 2 iscriminating ОП 
unlike the School i 
in the area of 
hildren did no 
y more often than they chose 
classmates of the opposite sex, unl ike the stu- 
dents of 3. In eneral, the data in- 
t sex differences are sharply drawn for 

t for «qiking", and do not 

seem to ope ractions in which these 
third-graders p *demand- 

» 


ing’: 


iment has resulted in а sys- 


ratings 


classroom teacher,
independently made, were significantly related. Distributions of
scores between the sexes and among the schools showed no
significant variations along these two dimensions.

The hypothesis about the relationship of general adjustment, as
rated by teachers and mental health workers, to the degree with
which the subjects restricted their sociometric choices, was not
supported by the data. The analysis was designed to test the
idea that the more disturbed children, as judged by adults, make
fewer relationships than those students who have been rated as
better adjusted. The significant interaction effects between
school and IPR (interpersonal restriction) do, however, indicate
that the relationship between these factors varies from school
to school. There was, further, support for the hypothesis that
IPR is related, in peer judgments, to adjustment, for the data
demonstrate that children who had but few interactions in the
classroom were less frequently positively chosen and more often
negatively chosen than those whose range of sociometric choice
indicated greater adjustment in the social situation.

Sex differences were consistently observed, with but one
exception, only in the dimension of play; and just the opposite
obtained in the dimension of demandingness, where boys and girls
did not discriminate sex in their choices, whether positive or
negative, and no more often chose same or the opposite sex.
Again, there were differences among the schools, differences
which have been found consistently with respect to other
measures in the overall evaluation program.


Summary 


As part of an evaluation program in com mu- 
nity mental health servic 
Were observed in the 
Social unit. A sample of 91 third grade school 
children were a 


their household, including family, relatives and 
other members regularl 


this list were added the 


5 in the home; Sex of the child: 
rin sd of Кл all related to disturbance in 
None of these hypotheses found support in the data from the
sample of children in the present study. There was, however, a
"modal" perception of the family constellation which corresponds
to the expected pattern in


Another set of observations consisted of a series of sociometric
choices: "like most", "like to play with", "not demanding",
"like least", "do not like to play with", and "demanding". These
data are reported in terms of the number and variety of peer
relationships, their relationship to ratings of general
adjustment, and sex and school differences. To enable the
treatment of data gathered from the children's sociometric
choices, a scoring system has been devised to provide an index
of both number and direction (positive or negative) of choice.
It was found that these sociometric scores significantly
differentiated the four levels of mental health used as
criterion measures. Further, it is significant that there is
high correspondence between the criterion ratings by trained
mental health workers and by classroom teacher, and the
independently made pupil sociometric choices. Thus, these data
indicate that, with ratings by trained adult raters as
independent criteria, students' sociometric choices as early as
the third grade of school significantly differentiate the four
levels of psychological adjustment of school children here
specified. Finally, children with but few classroom interactions
are more often negatively chosen and less often positively
chosen by their classmates, suggesting that the degree of
interaction in the class may be one of the socio-psychological
dimensions along which the children range their evaluations of
classmates. The data summarized here also indicate significant
variations in these interactions from school to school.

was developed by Rogers while


child were obtained from clini 
social workers 


psychologists, 
te knowledge of and contact 
ratings were then com- 


es on the test. 
group 


—those who fe 
give certain respnse 8. 
dreaming children tended to give other re- 

And so on with other types. From 
these typical re s it was possible to 
build up à scoring system which applied to 
other children 


adjustment 
tended to

A RE-EXAMINATION OF PERSONALITY STRUCTURE
IN LATE CHILDHOOD, AND DEVELOPMENT OF THE
HIGH SCHOOL PERSONALITY QUESTIONNAIRE


RAYMOND B. CATTELL


U 
oped ee VERY recently in the history of fac- 
Only one eee measurement, there has been 
use ata actored questionnaire instrument for 
named upon level. This instrument, first 
ected Foe Personality Quiz (IPQ) Wt 
by Cattell an research which has been reported 
(12). ӨП and Gruen (15), апа Cattell and Beloff 
ly bear sychologically, the JPQ factors general- 
een a relationship to factors Wo» E 
lon has b also at the adult level, and this rait: 
ап instru ееп proved by correlation (12). H su 
E Ex is to become widely use% a i 
Owever ical or research purposes; here € 
tion of te need, scientifically, for re-examina- 
Оп an і e Structure of its component factors» ч 
Some n ependent sample (for experience sho s 
Su S ten in simple structure). There iS) 
Scale Pi ула need for the extension 
creased 9 an equivalent B form: to а 
Imm che abies possible from longer teS ^» 
(1) Check ss practical aims меге, refo Te 
ational © the factor structure, particularly s 
: atone en (2) intensifying the mani 
identific ing the scales, and (4) checking 
ished zm of the factors against those 
9 adati ater ages, and possibly adding 
Portant ional factor scales that might pr 
Unfam il j However, for the benefit 0 e ger 
Publicati with the several, previous, elate 
Ош дан Ge V Bo, 15, 17), it should be ointe 
testionnas research on personality E 
Tough naire responses in the age r Ys 
More pat! years, is also а planned art 0 
i &eneral basic research program are 


g 
ong 


rot 


5 


ermina osram has had asits object 
yie OÑ of personality structure, ? 
attac and related methods, in а ©° 

©К, (1) over the three possible medig 


r 
al observation, namely, L- u Ай ^ 
eet 


ng 

per з 

by, Personality and motivation structure- fede- g 
о 


N ОЕ pERSONALITY STRUCTURE 


EVELOPMENT OF THE 
IT QUESTIONNAIRE 


TELL 


RICHARD W. COAN 
HALLA BELOFF 
University of Illinois 
l. T — MÀ 9 

he Setti š А i behavior, in situ; Q-data, or response to ques- 
Structure Hos ше Problem in Per sonality tionnaires, from introspective self-evaluation; 
2Tucture Research and T-data, or object ive non-self-e val uative, 
pehavior, and (2) over the develop- 


test response $ 
mental age range, by cross-sectional struc tur- 


ings at the adult level, at 14 years (as here)at10 
at 7 years, and at 4 years of age. The 


years, years, 
general coordination of L-, Q- and T-data find- 
ings 1$ discussed elsewhere (17), and this account 

ionnaire findings in 


nly to the point of referring 
to their integration with 
poring ages. 
2. Design of the Experiment 
The present design called for the invention of 
tensive new ool of questions, out of which a 
high-school questionna ire could 
It was proposed to evaluate these new 
the factor structure 
. The new question- 
rk will be called the 
hool Personality Que stionnaire. 
nal JPQ uestionnaire takes its author- 
The ODE quests of 295 items (103 di- 
rectly factored) on 333 eleven- and twelve-year- 
s and girls, by Cattell and Gruen (15), and 
a subsequent questionnaire construction analysis 
and Beloff (12), which first positively 
the childhood factors in terms of adult 
16 PF factors (18). 
The resulting JPQ test has been a valuable interim instrument,
permitting such research on personality factors as the
determination of nature-nurture ratios (13), and their
predictive power in regard to school achievement when abilities
are held constant (9,23). Although this research has confirmed
the psychological meaning of the factors and shown that they
behave with the expected independence, it has not allayed the
doubt that questionnaires with children need to be longer than
with adults to achieve satisfactory reliabilities. Therefore,
further work was indicated, both to extend and intensify scales,
and to check the factor identifications.

Design of the new items was carried out jointly by Cattell and
Coan in America and Beloff in Britain, with the aim of obtaining
sets suitable for international scales. Guided by the aim of
uniformly covering the personality sphere (9), an initial total
of 450 items was constructed. This


sirable extremity of 
(3) absence of suffici 
distribution of items 


n as might be found among 


› i.e., at the lower limit of 
the 12-17 year range for which the test was in- 


Phi coefficients were worked out between these items and the
original JPQ factors, in an 11 x 365 table. By taking only those
newly invented items which had a more central yes-no cut than
90%-10%, and which proved to have adequate correlation with one
or more of the 11 dimensions of the existing JPQ, the total was
reduced to 251 (plus


132, constituting the non-intelligence items to be 
added from th Parentheti- 


trix. Parcelled factor analysis is indicated as the preferred
technique when sound prior knowledge exists about the nature of
the general factor structure and the cluster affiliations of
most items.
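
The item screening described earlier in this section (a phi coefficient against each existing factor, together with a yes-no cut more central than 90%-10%) might be sketched as follows. This is an illustration under stated assumptions, not the authors' computation: item responses and factor scores are both assumed to be coded 0/1 (the factor scores dichotomized beforehand), the function names are hypothetical, and the default `min_phi` of 0.15 is an invented placeholder, since the text says only "adequate correlation".

```python
import math


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length lists of 0/1 values."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        return 0.0  # no variation in one of the variables
    return (a * d - b * c) / denominator


def has_central_cut(item, lower=0.10, upper=0.90):
    """True if the proportion answering 'yes' lies strictly inside 10%-90%."""
    proportion_yes = sum(item) / len(item)
    return lower < proportion_yes < upper


def keep_item(item, factor_splits, min_phi=0.15):
    """Retain an item with a central cut and an adequate phi on some factor.

    min_phi is a hypothetical threshold chosen only for this sketch.
    """
    if not has_central_cut(item):
        return False
    return any(abs(phi_coefficient(item, f)) >= min_phi for f in factor_splits)
```

Applying `keep_item` to each candidate item against the dichotomized factor scores would yield a reduced pool of the kind described, although the actual threshold and procedure used in the study are not stated in the text.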
In this case we were confident of the general factor structure
of the JPQ and in any case planned to represent it in the new
factorization by providing two markers from each factor. The two
markers were constituted by putting the 12 items for each factor
into two short (6 item) scales. Twenty-four such markers for the
twelve factors (intelligence was not included) were to be the
chief landmarks for structure, and they constituted the first
twenty-four variables in the correlation matrix. We also knew
the approximate factor correlations for the 251 items, from the
rectangular matrix above, and, on this basis, these items were
placed in 46 homogeneous "parcels" averaging six items per
parcel. These approximately homogeneous, but certainly not
necessarily factor-pure, "parcels"


terms of these 
ual’s factor scores from 


tion matrix (383 = 


The particulars of these calculations are given 
in the following section. 


3. Confirmation of the JPQ Factor Structure


The battery of 383 items was administered in two sections, one
day apart, to 168 12- and 13-year-old boys and girls in the
schools of Terre Haute, Indiana. Score distributions on each of
the seventy short scales were dichotomized at the
У 70 “parcel” correlation m a- 


Served for rotation, 

Rotation began by the trial vector method (24), the markers from
the JPQ giving rough initial direction to the first eleven
vectors and the remainder being placed for high communality.
However, this was merely a starting point, for no fewer than 25
overall rotations, guided by visual plots, were then made toward
maximized and statistically significant oblique simple
structure. Rotation ceased when the percentage of variables in
the hyperplanes of the whole configuration had risen, at the
20th rotation, to a plateau at 65%, and when five further
rotations failed to produce any increase. This is far beyond the
P = .01 level of significance of simple structure by Bargmann's
test (2), and from the visual plots alone it is clear that there
is a very good structure. The fact that the absolute proportion
in the hyperplane does not reach the 68 or 70% level we have
reached in the 16 PF questionnaire (8) can probably be ascribed,
in view of later findings, to the lesser reliability of
individual questionnaire item responses with children, blurring
the hyperplane from ±.10 to a somewhat broader band.
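
The stopping criterion described above rests on a simple count: the share of loadings lying in the hyperplanes, here taken as the ±.10 band mentioned in the text. A minimal Python sketch follows; the function name and the toy loading matrix in the usage note are hypothetical, and the sketch does not reproduce Bargmann's significance test.

```python
def hyperplane_percentage(loadings, width=0.10):
    """Percentage of loadings whose absolute value lies within `width` of zero.

    `loadings` is a list of rows (variables) by columns (factors) taken
    from a rotated factor matrix.
    """
    values = [value for row in loadings for value in row]
    if not values:
        raise ValueError("the loading matrix is empty")
    in_hyperplane = sum(1 for value in values if abs(value) <= width)
    return 100.0 * in_hyperplane / len(values)
```

For instance, `hyperplane_percentage([[0.05, 0.60], [0.00, -0.09], [0.40, 0.11]])` counts three of six loadings inside the band. Tracking this figure after each rotation, and stopping when it no longer rises, mirrors the plateau rule described in the text.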
The unrotated matrix (Vo), simple structure matrix (Vn), and
transformation matrix are preserved at the American
Documentation Institute, where microfilm copies may be obtained
by ordering number 72459, while the cosine matrix is shown in
Table IV (a).

rotated 


On examining the simple structure 


matrix, Vn, from which Table I is extracted, it 
for each of the 


а found that the two markers for 
IL twelve JPQ factors had faithfully ар- 
depend at the head of the fac 
Bin ndent simple structure r 
e misplacement in the case of 

ат P 8 as shown in Table 1, for in these three 
е marker had appeared correcty and one had 
cept for the 


tors in th 


thre e astray into another factor: 

dins off-diagonal values shown in the table, 

will r off-diagonal loadings меге negligible. ^* 
be seen, the old markers always retain thelr 


c + 
haracters and remain the highest Valve" in 


го 

ents os column for each factor. 

saliente wi of course, joinedan 
6 pur when the new items from ed 

doing?) ыз were similarly correlated (after өш 
оге ) with the factor estimates; as describe 

precisely in the next section- 


4. = са 
gg id point we had confirmed, to the dum 
ity f ed, that the eleven dimension? of pee 
Sphere, in to span the childhoo d ersonali Y 

re 2 the original JPQ research (12,15) 
tive correct in themselves and still representa- 
new and wet for two new factors, even among ? 
Wo nd wider range of response items. There 
naire Seem to be little doubt, aS far as d these 


thir 
со te 


Pre x 
hensive functional unities in
structure at the high-school age.

Now it remained to determine the validity of every one of the
383 usable items against these thirteen uniquely rotated
factors. This required that we estimate the scores of each of
the 168 children on each of the thirteen factors and then
correlate all individual test items with each and every factor.
It will be recalled that each of the short scales or "parcels",
in which the determined factor structure was expressed, is some
six items in length. Since the salient parcels in any factor
were, roughly, of the same order of loading, and since the items
in the parcels were, roughly, homogeneous, the factor estimate
was made by giving one point for each item and thus, generally,
a 0 to 36 score to each factor. (It has been shown (1) that the
extra trouble of fine weighting in these situations is
unjustifiable.)
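
The unit-weight estimate and the subsequent item-factor correlation can be sketched as below. This is an illustration only: responses are assumed to be already scored 0/1 in the keyed direction, the function names are hypothetical, and an ordinary product-moment correlation is used because the text does not name the coefficient actually computed.

```python
import math


def unit_weight_score(responses, keyed_items):
    """Factor estimate for one child: one point per keyed item endorsed.

    `responses` maps item number to 0/1; `keyed_items` lists the items in
    the salient parcels of the factor (both are assumptions of this sketch).
    """
    return sum(responses[item] for item in keyed_items)


def correlation(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant variable cannot correlate with anything
    return covariance / math.sqrt(spread_x * spread_y)
```

Scoring every child on each factor with `unit_weight_score` and then applying `correlation` to each item column against each column of factor estimates would fill an items-by-factors table of the kind the section goes on to describe.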
It will be recalled that our aim was both to ‘‘in- 


tensify’’ the existing 
rm A of the new test, called the HSPQ) 


and to develop 2 second equivalen 

HSPQ. Тһе term «ántensification" has been Spe- 

cialized in factor scale research (7,8) to mean 
loaded items— relative 


the геј lacement of lower 
ew items of higher loading, togeth- 


with the factor à 
the scale would be gaining ho: 


1d thus include renova- 


nly of items but also of exactness of fac- 


tor determination, a 
search by 


ror of the difference. | 
Parcelled factor analysis (7,8) sometimes stops at undoing
parcels to correlate items with the single factor in which the
parcels are loaded. In this experiment we did not follow this
shortest course, but correlated items with all other factors
too. This is desirable, when time permits, because assignment to
parcels is only on the basis of rough evidence, and there is
certainly a possibility that a single item might load another
factor more highly than that to which its parcel belongs.
Moreover, knowledge of the loadings of items on all factors is
necessary for the final assignment of items if one wishes to
prepare buffered scales (16) by arranging mutual "suppressor"
action (16) among factors alien to the factor measured.
Accordingly, these relations were calculated as a 13 x 383
correlation matrix. This is a large matrix, but still less than
a thirtieth of that necessary for the ordinary factor analysis
of the same data, which would be too large to fit any known
electronic computer program. The saving by parcelled factor
analysis is achieved, however, by accepting the degree of
inaccuracy involved in getting item correlations with an
estimated factor (of about 0.8 reliability) instead of
determining loadings on the 100 percent accurate "pure" factor
as in typical factorization.
cause of in , the 383 total was necessary be- 
euo eed thet both the 132 items of 
same ee JPQ and the 251 new items into the 
to know e For, of course, it is necessary 
Scales S4 ich of the existing items in the old 
be replace be of sufficiently low correlation 10 
Safe to "p and for this purpose it would notbe 
inal "uu Ack to their correlations in the orig- 
Strict Бо (11, 14) for these would not have 
replacin, pling comparability with the potential 
correlati items in the present sample nor woul 
сораган with true апа estimated factors be 
Shows able. Actually, the 13 x 383 matrix 
prüfisiat remarkably good confirmation of the ap- 
items. eness of assignment of the original JPQ 

At this point we can conclude: (a) that the original factor structure
was essentially sound; (b) that nine or ten out of twelve items in
each old scale continue in the new sample and rotation to load the
factor to which they were originally assigned, with the right sign
and significant loading. Consequently, the intensification described
in the next section had the task of replacing only two or three items
in each factor by more substantially factor-correlated items.*
5. Construction of the High-School Personality Questionnaire

idue of sufficiently 
ligence factor 
bove analysis, giv- 
sonality fac- 


оаа ай initially, а res 
items) я (other than inte 
ing ten ia the target of the a 
ors for ems for each of the 13 per 
rme Sach OF Uie two (A and B) equivalent 

- Starting as we did with 4901167 (383 at 


factorizati 

Casas 2308, we found the above attrition pro- 
margin in factoring finally gave US just а narrow 
Items hs sufficiently loaded (0.20 ог higher) 
lon or ii this necessary tota. 
Ore, foll questionnaire from this P 
Ple fact owed the usual canons for a £0? 

or scale test, as follows: 

1. Each factor scale was built from items most highly loaded on that
factor, always having, at least, a significant loading, well clear of
the hyperplane.


103 of the original JPQ item 


* 
Afte 
T deleti 
Form A a for other causes; 


ЇЧ 


2. Suppressor action was introduced. Principally this meant, where
good items had significant loadings on other factors, an attempt
never to have more than one item on the relevant factor that loaded
(in the same direction) on the same irrelevant factor. Whenever
possible a loading on an irrelevant factor Y, introduced into Factor
Scale X in the use of an item “a”, is suppressed by finding an item
“b”, also loaded on factor X as is item “a”, but loaded additionally
on Y, with opposite sign on Y to item “a” (16). See diagram for an
illustration.

3. Each factor was planned to be scored on items completely
independent of those scored for any other factor, to avoid spurious
correlations among factor scores due to sharing specifics and errors.

4. Position and “yes tendency” sets were eliminated. As the work of
Berg (2), Cronbach (20), and the present writers (9:245) have shown,
the tendency to agree rather than disagree is (a) systematic and (b)
personality correlated. Consequently, for each ten item factor scale
in each form, five items were chosen to contribute positively with an
“a” or “yes” answer, and five with a “b” or “no” answer.

5. The total pool of items of satisfactory validity for each factor
was sorted into ten items for the A and ten for the B form, in a way
to give equal total validity (equivalent “wanted factor” saturation)
to these equivalent forms.
6. A fourteenth scale, the intelligence dimension, was added to the
thirteen personality factors by taking the most “g” saturated and
equal difficulty-spaced items from each of four sub-tests of the
senior author's Intelligence Scales (4) for children of this age.

7. The 140 items (130 plus 10 intelligence items) per form were
finally arranged as to factors in a modified cyclical order, to
separate items pertaining to one factor and to give maximum
convenience in stencil scoring of a machine answer sheet. The
positions of the factor items, and the positively scoring
alternatives for each factor, were also arranged to be identical on
the A and B forms.
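
The sorting of an item pool into equivalent forms (canon 5) can be illustrated with a short sketch. It is not the authors' procedure: the source says only that items were sorted "in a way to give equal total validity", so the greedy rule, the function name `split_into_forms`, and the twenty loadings below are all assumptions made for the example.

```python
def split_into_forms(loadings, per_form=10):
    """Deal a pool of item loadings into two forms of near-equal total loading.

    `loadings` maps a (hypothetical) item label to its loading on the wanted
    factor. The greedy rule is an illustrative assumption, not the procedure
    reported in the source.
    """
    if len(loadings) < 2 * per_form:
        raise ValueError("pool too small for two forms")
    # Keep the 2 * per_form most highly loaded items, strongest first.
    ranked = sorted(loadings.items(), key=lambda kv: abs(kv[1]), reverse=True)
    ranked = ranked[: 2 * per_form]

    forms = {"A": [], "B": []}
    totals = {"A": 0.0, "B": 0.0}
    for item, loading in ranked:
        # Forms that still have room for another item.
        open_forms = [f for f in ("A", "B") if len(forms[f]) < per_form]
        # Give the item to the open form with the smaller running total.
        target = min(open_forms, key=lambda f: totals[f])
        forms[target].append(item)
        totals[target] += abs(loading)
    return forms, totals


# Hypothetical pool: 20 items with loadings from 0.20 to 0.58.
pool = {f"item{k:02d}": 0.20 + 0.02 * k for k in range(20)}
forms, totals = split_into_forms(pool)
print(len(forms["A"]), len(forms["B"]))
print(round(totals["A"], 2), round(totals["B"], 2))
```

A fuller version would also balance the “yes” and “no” keyed items within each form, as canon 4 requires; that step is omitted here for brevity.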

These canons of good factor scale construction contain no reference
to some of the procedures currently popular among educational
psychologists and sociologists, for the sufficient reason that such
scaling habits apply to “itemetrics” rather than “factormetrics”, as
defined elsewhere (16). Mere inspection may suffice to define
“content” in education and sociology, and homogenizing techniques can
do the rest; but this is not so in the complex field of personality.
Walker-Guttman scaling (22,25), for example, does not produce, except
by chance, a simple-structure,
s have carried over into the H3PG 


factor-pure scale. Again, high homogeneity of items, as by Cronbach's
alpha coefficient (19), is obviously no guarantee of factor purity.
Indeed, in multiple dimensional material, a fair number of items
designed by good factor loading and suppressor action to be most
valid for a unifactor scale are likely to show negligible
intercorrelations


Scores possible—all passed or all failed — on 


as Table VIII 
tic judgment 
rigid “minimizing”? rule for item 


shows, our selection by an artis 
rather than a 


ina normal group 


ing distance’ for ab- 
normal cases. 


The reliabilities and validities from the above scale construction
have finally been examined experimentally under the following
definitions:


(as “indirect 


validity’’) ale’s correla- 


tions with 


For example, 
ate -0.4 with 


» Tespectively, the 
alidation, in the realm of 
concept (construct) validity. 

2. Reliabilit 
ten different co 


e the Consistency, Stability 
The first is exam- 


ndom, but m utu- 
ally independent cuts. The second is a test-re- 


versions of each scale. 
These are set out in Table II. 


Diagram 1). 


The equivalences, and still more the consistencies, run somewhat
lower than those to which test users are accustomed on non-factor
scales, but the meaning of this is best discussed in the following
section on validity. For it will be noted that reliability in the
sense of the dependability coefficient (9:352), i.e., freedom from
experimental error of testing, is as good as is typically achieved by
scales of this length (10 items). Allowing for the fact that function
fluctuation on some personality factors, e.g., Surgency-desurgency,
is known to be appreciable, the stability coefficients suggest that
actual test dependability is high, and the dependability coefficients
(immediate test-retest, gathered on only a few cases and therefore
not shown above) are in fact high, though systematically not as good
as for adults on 10 item scales. The marked parallelism between the
consistency and equivalence coefficients, along with the high
dependability, points to the lowered consistency and equivalence
values needing to be understood in terms of factor validity differing
from the usual “homogeneity validity” as discussed below.


6. Validities: Direct and Indirect


It has been stated 
Sists in the scale cor 


cumstantial validity2 e 
of the scale with the p } 
mated in two distinct Ways in Table III: 


l. Taking the known factor loaaings (actually 
reference vector Correlations) of the items (cor- 


| 
the case Where wehave correla- 


: : WO toget e de- 
tons with all else Which they ri n ш. 
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imate here) and equal weight, which our scoring 
system gives.

2. Taking the square root of the equivalence coefficient. This is
approximate because it assumes that complete suppressor action has
been achieved, so that no unwanted common factor variance is shared
by the A and B forms, but only two specifics. However, appreciable
parallelism of results from these two approaches, showing higher
validities for factors B, C, H, O and Q4, and lower for A, E, D, I,
Q2 and Q4, suggests that the approximations are minor. The tendency
of (1) to run consistently higher than (2) in Table III is discussed
below.
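
The reasoning behind the second estimate can be made explicit. The following is a sketch under the assumption stated in the text, namely that forms A and B share nothing but the wanted factor F and are equally valid; the symbols v and r_AB are introduced here for illustration only.

```latex
r_{AB} = r_{AF}\, r_{BF} = v^{2}
\quad\Longrightarrow\quad
v = \sqrt{r_{AB}}
```

On this assumption an equivalence coefficient of 0.36, for instance, would correspond to a validity of 0.60 for either form.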


correlate in the stated, expected ways. We have taken the universe as
that of all other known personality factors (including intelligence)
in the later childhood age range. The cosine matrix
((a) in TableIV) shows the c 


ircumstantial Validity 
) withacorre- 
e., the extent to which 
rrelations with other person- 
ales agrees with the pure fac- 


note that fac 
higher values on both di 
ities, while A, J 2, and Q 

e J, , 3 tend to run low 
9n both, Consistent With other indications. For 


7. Standard Scores and Standard Interpretation
of Factors


As pointed out above and elsewhere (9) it does not suffice, nowadays,
to identify and name a questionnaire factor from the “face validity”
of the questions which enter into it. It must be identified either
directly by correlation with a behavior rating (L-data) factor, or by
correlation with a questionnaire factor previously thus identified by
criteria.

It was planned in this case to administer the HSPQ along with the
recently checked and intensified 16 P. F. (7,8,18) to a group of
children (a) at an age in the border range of overlap between the two
scales, namely at 16 and 17 years, and (b) also at an early or middle
age in the HSPQ range, 12 and 13, but to a group of children of above
average intelligence, so that they would have no difficulties of
comprehension with the 16 P. F. test. The latter would bring the
benefit of both an independent sample and of a check on the effect of
age upon factor meaning. It is obviously important in such cross
identification to use the full length battery on both sides, for the
reliability of a single factor measurement on one form only is such
that the agreement of one factor with itself (across two forms) might
not be reliably distinguishable from the correlation normally
existing between one oblique factor and another (when sampling error
is added to both correlations).

Since the agreement between the 16 P. F. correlations from the two
samples (one of 175 cases, the other of 121) is represented by a
correlation of 0.7, their results are combined here, for economy of
presentation, into a single set of mean r's
; for simplicity, re 
cords only values derived when both r’s were sig 


-01 level inthe two ona 
al contributing tables. It will be seen that ther 


estion being significant beyond the P = "p^ 
level, (2) the r's also being simultaneously "i 
highest in the row and inthecolumn, i.e., the be 


or approaching from either 


be nee (Table П апа ref., 18). It will further 


Ority of identifications аре 
ntly, but tentatively, ma 


HSPQ FACTOR INTERCORRE 
VALIDITY COE 


(b) 
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J 22 
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in the original work on the JPQ (11). 
This table, however, brings out rather poor
definition of A (agreeing with its low reliability)
Чил as good, incidentally, as for the veter- 
an intelligence factor B, on the same fewness of 
items. G and Q3 are also weak, and again in rough agreement with the
reliability evidence. However, in the case of D and Q3 the peculiar
situation arises that the highest r in the row for 16 P. F. factor Q3
is D, not Q3. We have examined this problem in the light of much more
evidence than can be set out here, notably the effects of reliability
differences and the indirect validities, and have come to the
conclusion that D and Q3 are probably incompletely separated in the
HSPQ and that Q3 in the 16 P. F. has a good deal of D(-) in it.
Psychologically, the difference is that D is excitability, while Q3
is self-control, the former probably being temperamental and genetic,
and the latter connected with the rise of the self-sentiment. Special
research will be needed to separate them in the adult, where they
seem to behave virtually as face and obverse of the same behavior.

It will be noted that M, N and J have no correlations with the other
scales equal to their correlations with themselves, which adds to the
proof of their separateness (Table III; ref., 11).
the identifying 


в, Q2, Q н 
ће same scale, ії 


must be remembered that in 
in the no-man’s-land necessary to 
scales, factor B and Q2 in the HSPQ are too 
easy and childish, respectively, while 16 P. F. 
ae, B is too hard and factor 2a de ed ad 

isticated in activity reference: vio А 
however, further intensification of the HSPQ factors is still
necessary, and is being carried out, to get still clearer 16 P. F.
identification and mutual separation.

The standardization of the HS à 
às to its factors, and with factor B (intelligence 
сей, was carried out with the following objec- 

s: 


1. To take American high 5 
"i x uh 16 as the reference gute 
, 2. To standardize separately 1° 
lees For, as the results in аре VII show, 
usc y With ова of the adult 16 P 
е significant sex differences OP about hal’ P. 
{се personality factors. the other hanc; ag 
fcio are significant (Table vil) on. 
бане d and for this reason, to 2 id ene 
th of standardization tables, it Ө prop 
in these cases the raw scores e | 
e 
Orrected for age before entering the main 


tables. 
3. To have norms available both in point scores (in this case stens)
and in ranks (in this case deciles). Stens are preferred to stanines
by most users, apparently because use of the decimal system has
accustomed us to ten point scales; but the provided mean and sigma of
raw scores in the tables enables stanines also to be readily
calculated when desired. The stens are standard stens (18), that is
to say, they do not literally represent half sigma units along the
obtained distribution, but units fitting the areas normally covered
by half sigma when the obtained distribution is made into a normal
curve, by expanding or contracting the raw score scale on the base
line to produce a normal curve. In other words, sten and decile
values translate into one another according to the standard relations
of the normal curve.
4. To have norms both for single forms A and B, and for the two
equivalent forms when used together. For, due to lack of perfect
correlation, the standard score on A and B will not be literally the
mean of that obtained from the two parts. This three-fold
presentation, together with sex and sten or decile alternatives,
results in twelve tables, which have been supplied with the Handbook
(11).
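
The conversion to “standard stens” described under objective 3 can be sketched in code. This is a minimal illustration and not the Handbook's norm tables: it assumes mid-rank percentiles within a hypothetical norm sample and sten boundaries at half-sigma steps of the normal curve, with stens 5 and 6 on either side of the mean. The function name and the sample scores are invented for the example.

```python
from math import floor
from statistics import NormalDist


def normalized_sten(raw, norm_sample):
    """Convert a raw score to a normalized ("standard") sten.

    The raw score is placed at its mid-rank percentile in the norm sample,
    mapped to the normal deviate with that cumulative area, and then cut
    into half-sigma bands numbered 1 to 10.
    """
    n = len(norm_sample)
    below = sum(1 for x in norm_sample if x < raw)
    equal = sum(1 for x in norm_sample if x == raw)
    # Mid-rank percentile; the clamp guards raw scores outside the sample.
    p = (below + 0.5 * equal) / n
    p = min(max(p, 0.5 / n), 1 - 0.5 / n)
    z = NormalDist().inv_cdf(p)
    # Band edges at z = -2.0, -1.5, ..., +2.0; stens 5 and 6 straddle the mean.
    return min(10, max(1, floor(2 * z) + 6))


# Hypothetical norm sample of raw scores on one ten-item scale.
sample = list(range(1, 12))
stens = [normalized_sten(x, sample) for x in sample]
print(stens)
```

Because the mapping runs through percentile areas rather than raw-score distances, a skewed raw distribution is stretched or compressed into the normal form, which is the sense in which the text says sten and decile values translate into one another.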


It is of theoretical interest that the larger sex 
e—indicating boys to be more 

e and of greater ego strength, and 
decidedly more premsic andof greater 


ance and lesser guilt proneness (0 factor) found 
for adult men is only found in boys here at a low- 
er level of significance. 
The age trends are not so consistent, for self-sentiment control is
falling here, in early adolescence, whereas it rises slightly through
post-adolescent life. There is agreement, however, on increase in ego
strength and decrease in ergic tension throughout the life course.
Age corrections are finally recommended in the Handbook


Intelligence, and Q3, self-sentim ent 


The present initial norm tables (in skeleton form in Table VIII) are
based on a sample of boys and girls, from 12 through 18 years of age,
gathered from 17 schools in different parts of the country, but
mainly from middle sized towns in the midwest states and Texas. As
the values in

ful in choosing 
that the mean raw score occupies the approximate center of the
possible raw score range, while three times the sigma, each way,
spreads out to cover the possible raw score range.
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TABLE VII 


SIGNIFICANT AGE AND SEX DIFFERENCES ON PERSONALITY FACTORS


1. Age Trends*

Factor                         t       Differences     P Value (Double-tailed)
B   Intelligence               3.57    Older higher    .001
C   Ego-strength                .81    Older higher    Not significant
E   Dominance                   .69    Older higher    Not significant
G   Super Ego Strength         1.22    Older higher    Not significant
J   Coasthenia                 2.32    Older higher    .02
Q3  Self Sentiment Control     3.70    Older lower     .001
Q4  Ergic Tension               .71    Older lower     Not significant

*This part of table is restricted to factors in which the trend is
the same on the A and B forms; t values are for the complete battery
score, on 500 cases.


2. Sex Differences

Factor                         t       Differences     P Value
A   Cyclothymia                4.66    Girls higher    .01
C   Ego strength               9.65    Boys higher     .01
F   Surgency                   3.33    Girls higher    .01
G   Super Ego Strength         3.63    Girls higher    .01
H   Parmia                     2.70    Boys higher     .01
I   Premsia                   12.14    Girls higher    .001
J   Coasthenia                 5.55    Girls higher    .01
Q3  Self Sentiment Control     2.96    Girls higher    .01

Note: From complete battery (A and B) scores; 333 cases.


TABLE VIII


CENTRAL TENDENCY AND DISPERSION ON ALL SCALES 
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Note: Based on 1089 boys and girls aged 12 through 18 years. 
DIAGRAM 1


THE DIFFERENCE BETWEEN VALIDITY AND HOMOGENEITY
WHEN SUPPRESSOR ACTION IS INVOLVED
8. Discussion: The Relative Value to be Assigned
to Reliability and Validity Coefficients


ME ede S ee oper accustomed to ex- 
four хелаш ев of 0. 95 and upward (at leastin 
е catalogues: ) for achievement tests, or in 
i Aa attitude questionnaires, some un- 
ona ing is required of the changing im por- 
Tee uei and validity as one turns to 
ree, tests. On the whole, the present 
ten diem ensional scale presents validities (per 
BEhiey i higher than have hitherto been 
‘Se ү їп {һе personality field; but the consis- 
Se (homogeneity) and equivalence (A and B 
шын” or with the 16 Р. Е.) coefficients arelow 
int ive to the stability and validity coefficients, 
erms of experience in other areas. 
dm T impression may perhaps be sum- 
not хе» їп the observation that reliability does 
oo validity to the degree one expects 
Now experience with specific educational tests. 
ed s sewhere (9), we have tentatively stat- 
eve i principle that there is a tendency for most 
vi ay life behavior (as in the content of ques- 
(a) naires) to be factorially complex when it 
him “involves” a lot of per sonality, i.e. 
iid esirably high personality factor lo ading 
eene si and, therefore, (b) avoids suc- 
ation ully that degree of specificity of item situ- 
from кшш would make it of 
cult sample to sample and sub-culture to sub- 
compas Apropos of the latter, it has been the 
i бы. of many psychologists that highly 
sapa UU scales, whether obtained by the 
uniques of Walker (28) and Gu (22), or 
(Whi r devices for high internal consistency 
in io lead to virtually rewriting the same item 
(a) yl near-synonymous forms!) are generally 
very ma likely to deal with psychologically 
е наад specific interests, etc., than broa 
"ailes personality factors suchas general per- 
o ere d theory recognizes, and (b) more liable 
testing > from sample to sample, and from One 
a ns Situation to another, an instability in the 
ey d of whatever broad personality factor 
acters ТО It is as if the spec 
Single rie of a mere single item as we 
estin, con® instability with sub-culture an 
wire were multiplied in such highly 
his 9us scales to cover the 
Hd aple is correct, go 
act es are most likely to 
Ore огіаПу complex items, 
асе Se suppressor action; = 
Scale for 4 unwanted common factors» tog Š 
(a) hop a single factor. AS 2 consequence: 
Td gor ГЫШ tena to Pa low, and ae , 
аы S tha customed to using test: d qesiy- 
wee o oH ten or a dozen if it is judged’ g 
hic ar tain the consistency - Г el iabil m i 
€ easily reached with comparatively 


unstable meaning 


factor 
usin 
thane H 


i.e. 


ts wi 


items in the specific-factor “homogeneous” type of test. In other
words, instead of sacrificing essential validity to a show of high
consistency and high equivalence-reliability coefficients, we would
do better to choose our tests more by their real factor validity
coefficients (concept validity), and gain high consistency and
equivalence coefficients additionally, if desired, by lengthened
tests.

This point can be quickly and graphically illustrated by Diagram 1,
which shows the basic situation in suppressor action. Items a and b
each correlate +.7 with the required factor F, but are chosen to have
opposite sign loadings (+.7 and -.7) on the unwanted factor U. If
these were A and B forms (A and B forms could be made easily enough
by multiplying such items) of a test, they would actually have an
equivalence reliability coefficient of zero. At the same time they
would have separate validities of 0.7 and a combined validity of 1.0.
This is not a trick case. The more realistically-complex experimental
situation which commonly exists, comprising specifics, error, and
more than one unwanted common factor, would merely complicate the
form of the calculation but leave the principle still operative.
However, the systematic trend noted in our Table III for validity
calculated from internal loadings (multiple r) to be greater than
that calculated from equivalent form correlations shows that low
homogeneity through widespread suppressor action is not the whole
problem. Comparison of our HSPQ and 16 P. F. results suggests
additionally that research should examine the hypothesis that in
children there is appreciably greater function fluctuation on traits
and lower dependability on items. (If the validity from the second
main row of Table III is corrected for attenuation by the test-retest
error represented in the second main row of Table II, it rises to at
least the internally calculated validity of row one in Table III.)
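
The figures quoted for Diagram 1 can be checked with a few lines of arithmetic. The sketch below assumes standardized items whose only common variance comes from two orthogonal factors, F (wanted) and U (unwanted); the helper names are illustrative and are not taken from the source.

```python
from math import sqrt


def item_correlation(a, b):
    """Correlation of two standardized items from loadings on orthogonal factors."""
    return sum(x * y for x, y in zip(a, b))


def composite_validity(a, b, wanted=0):
    """Correlation of the unweighted sum a + b with the wanted factor."""
    r_ab = item_correlation(a, b)
    covariance = a[wanted] + b[wanted]   # cov(a + b, F) for a unit-variance factor
    sd_sum = sqrt(2 + 2 * r_ab)          # SD of the sum of two unit-variance items
    return covariance / sd_sum


# Loadings as (F, U): both items load +.7 on F, with opposite signs on U.
item_a = (0.7, 0.7)
item_b = (0.7, -0.7)

print(item_correlation(item_a, item_b))              # equivalence of a and b
print(round(composite_validity(item_a, item_b), 2))  # validity of a + b
```

With these loadings the two items are uncorrelated, yet their sum correlates about .99 with F, which the text rounds to 1.0: the opposite loadings on U cancel in the sum while the loadings on F add.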
Accordingly, if we wish to measure fourteen major personality
dimensions in a forty-minute test, with resultant cut to 10 items per
factor, the best research on test construction at this time can only
produce reliabilities (equivalence or consistency coefficients) in
the thirties, but validities in the sixties. Since the purpose of
good reliability is to make validity possible, we should
icients in preference 


]come this ord : ў 
he However, Since the maximum 


alidity for а given type of instrument is 
and something above the sixties 
must strongly urge that 
rch now possible on per- 
Ек А gh employing these fac- 
sonality т “Should: (a) use both forms of the test, 
per factor, and (b) take 


fatigue, an o sustain mO 
The writers wish to expr 
THE BEHAVIOR OF TEACHERS AND THE PRODUCTIVE
BEHAVIOR OF THEIR PUPILS:
I. “PERCEPTION” ANALYSIS*


MORRIS L. COGAN 
Harvard University 


Introduction 
The Measurement of Competence 


"ul RECENT history of education inthe 
United States is marked by numerous attempts to measure the
competence of teachers. In view of the importance of formal education
in contemporary society, such a preoccupation is readily understood.
The findings of competence studies have, however, been inconsistent
and unconvincing. Many reasons may be advanced to account
study of the teach- 


сн slow progress іп the Г 
{һе рее ер. Foremost among these is 
the fact that there is little agreement on a basic definition of the
good teacher. Under such conditions, when fundamental issues remain
unresolved, it is almost inevitable that the results of even the most
rigorous research should be severely criticized. Such conflicts,
essentially philosophical in nature, will not be resolved by the
present study. The purpose of this research is to provide some
objective evidence upon which ultimate value judgments may be based.
meas many competence studies, e criterion 
experts, ! have been defined 25 
ness er supervisors, and principals. 
регіт Such measures is that logic 
Опа ао (at it has been relative 
ated t € that they are often not 
Piae pupil change, growth, d 
as vali these variables are generally recognize 
On ү criteria of teacher comP 
adopted e other hand, when pupil 
Operationa the criterion measure an 
ings is nally defined, the usefulness > ; 
TE e ОУ restricted by limitations © 
for th ments and techniques at present ava 
ment е measurement of subject-matter achieve- 
and of of growth in social and Lear nitg skills, 
plicato ESS in attitudes. To these 
ons a third may be added: i 


the opinions of 
The weak- 


*All footnotes will be found at end of article.


identifying a specific teacher to whom such changes can be
attributed, a problem that becomes especially acute in the
departmentalized grades.


factors are invo 
first is that the criterion measu 
terms of the amount of work performed by the pupils. Such consequent
measures avoid the disadvantages of ratings by principals and
supervisors. They fall short, however, of measuring pupil change.
Nevertheless, it is felt that pupil work is very closely related to
pupil change in the learning sequences of the classroom. If it is at
present impracticable to measure pupil change, then the measurement
of pupil work as the variable intervening just prior to such change
may be a productive concept.
A second important element of the research is 


incon tradistinction 
a sort of global variable called “competence’’. 
The third major factor in the design of this study is its reliance
upon the reports of pupils as the most important source of data
concerning their work and the behaviors of their teachers.
5! ratings of the pupils’ work 
and the princ i pals* reports on the behaviors of 
their teachers are both included in the data col- 
lected, the primary e 
nce they arein an excellent 


cured from pupils, si 
ort on their own work and onthe be- 


haviors of 
The Theory and the Variables 


The dependent variables of the study are (1) 
the amount of required work performed by the 
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Pupils, and (2) the amount of self-initiated work 


ety-laden stimulus. They will, in ps ee 
performed by the Pupils. These scores are ob- to perform very little о ана зиде Recon 
tained through the responses to a questionnaire Шоро тст рою ty 
led the “Pupil Survey” ing longer ; í tha 
E required WOrk SUE 15 Secured by pre. to an unpleasant Situation. Onthe other hand, 
Senting to the Pupils a list of 30 of the most com- 


asked to report on their work 
teacher. The Scales and their 


: : Я t 
15 not given in for approach are termed * inclusive’? [оне Ше 
this subject when it is Elven I do it (1) a1 most tend to make him a cue for avoidance are 
never, (2) few times (3) Sometimes (4) man “preclusive”, : d 
times, (5) almost always Some illustrations The third independent Variable has been calle 
of the kinds of homework items Provided in the 


“conjunctive”, Although this designation is sug- 


t ercises; memori ze gested by H. A, Murray’s2 terminology, itis дег 
rules; solve number Problems; апа Prepareade- used to describe actions very differentfrom aren 
he envisioned, “‘Conjunctive’? refers to those 
The Self-initiateg Score is derived from the aviors of the teacher which g 
Pupils responses to 25 i i 


making visits to museums; doing less affect-laden than those of the inclusive and 
extra experiments. Six-point frequency Scale i 
1S provided with each item, runnin 


ey are, neverthel CEP 
g from “I considered to be a major factor in the teaching 
never do this”, to “I do it very often”, earning Process, 

The theoretical basis for th i 


acher as inclusive, pre- 
rawn from the 


Work of Murrays, Lewin, Lippitt, and White; An- 
imate to ara im i hat suc k is ргох- Gerson, Brewer, and Reeq5; Cattell6; and from 
teaches nct = E Process by whic the Writer S observation d experience. _ 
and the relations; ated to p Dil Change, me indications of the Organization of the in 
may be сеаномћ de Pupil Work to pupil change dependent variables, together with illustrative 
Mis А rdum Ows: : items for Sach, may Serve both to clarify the the- 
Pupils influences the nature anderen o the by «£l framework and to exemplify the method 
motivation of Pupils, (2) Communication With. by which Scores were obtained for each variable: 
Pupils, and (3) the “tone” of e classroom ex I. Inclusive 
Periences, which may instigate Certain pupil Á. Integrat 
work resulting in Pupil ch. B. Affiliate” 
he teacher ehaviors represented 4S the urt = 
first in the train of events €ading to Pupil š чаш 
change Constitute the independent events or i 
ypotheses, The Eory of the effects of седе T freclusive 
kinds of teacher behavior upon the Pupils’ work aativa 
is derived largely Írom the Social learning con ; pESTessive 
cepts of Miller ang Dollard. 1 ` "Sjectant 
These writers have developed Stron evi 
dence for the OCCurrence of TOCeg вва thers : Хошо 
by, if their Conclusions may be generalized to idis mg level o demand 
the classroom Situation, the teacher may become ieee ability to Communicate = 
on the one hand a Cue for anxj Yor, on theo “cat mpetence in classroom man 
for “liking”? ог “respect”, appropriate re. авец 
Sponse to anxiety is avoidance of Some sort: i 
appropriate response to liking is approach, Thus Чез (point frequency Scales provided for each 
the teacher who becomes 4Cue for Stron anxiety н 


Aggressive (; 
CCeptable mini- 28Егезсіуе (item 
mum of required Work; i.e., the Pupils wi 


30): Thi shouts 
"ped noi Yells at es. ): This teacher 
the most expeditious means of avoiding an anxi- m uat This teacher s ays that 
s 


Ought not to be in this class. 
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Integrative (item 33): When we start new 
work, this teacher helps us to see why 
this work is important to all of us. 

Affiliative (item 29): This teacher is 
friendly. 

Level of demand (item 57): This teacher
requires pupils to do work that is correct
and in good order.

Ability to communicate (item 54): This
teacher explains things so I can understand
them.

The Sample and Some Hypotheses 


S. population sampledc 
ool teachers in department 


решш Boston area. 
Mer sesuai were selected which d 
Eire dig paient Macs 
School ic criteria employed were: 
and years completed by persons 

over; (2) median income; 


males, 25 years old and over, with four years 
of workers 


of college education; and (4) percent є 
anging from profession- 

al, technical, and kindred workers to crafts- 
men and laborers. The two communities fin- 
ay selected exemplified sharply different socio- 
карш conditions. Thusi 
an organized *'socioecon 


ms evident. 
moe have been observed in 
ns are of course not gener? 
h munities within the population; 
and, the findings are more 
dues in which significant 
ces are not observed. * 
anc IG, ee difficult to indicate Wh 
т! be placed upon the resu = 

ir to say that teachers Were selected from two 
могаен differing in socioecono i 

eristics as sharply as P ithin the com- 
munities available to the rese 
"Ru problem of generalizing from 
Ee = findings is another matter. In 
the i measures observed for a 5 
Bam est of significance iS defi 
še pling of the pupils who are ta 
washer in the course of his profession 
“eure within-teacher findings, Жез To 
is ization to the pupils of the teacher inv 
s lausible; generalization to pupils 
if iro than those in the sample may be m 
БЕ findings about different teachers аге i 
ent i.e., if some regularities among 
PEE become evident from the analysis 0 
шасы. data. Such gen 
Samy would also be pound by th 

pling plan discussed above. 


j 


Data were collected from 33 teacher YN 

Data w ^ s, Хе 

principals, and — grade pupils 3 n five 

© pintor iugi sc s. The il s. 

the eighth grade level was iras in e m 937 
jmize the selective factor of school dropo “we 
which begins to operate strongly in the more advanced grades. An
eighth grade sample is more representative of the total population of
secondary school pupils in metropolitan Boston than is one drawn from
higher grades. Nevertheless, one can generalize the findings of this
study only with extreme caution to any grades other than eighth,
since we do not know how pupil perceptions of teachers vary with age.

The choice of departmentalized schools was dictated mainly by the interests of the writer. There has been relatively little research of this kind done in such schools, and an approach to the question of isolating the influence of a single teacher among the many with whom secondary pupils customarily work seemed to offer a challenging problem.

It is within this context that the present study has been made. It should be noted that the measures involved are derived from the reports of the pupils. Some representative hypotheses follow:


1. Preclusive behaviors of teachers are negatively related to the amount of self-initiated work performed by the pupils.

2. Preclusive behaviors of teachers are negatively related to the amount of required work performed by the pupils.

3. Conjunctive behaviors of teachers are positively related to the amount of required work performed by the pupils.

4. Conjunctive behaviors of teachers are positively related to the amount of self-initiated work performed by the pupils.

5. Inclusive behaviors of teachers are positively related to the amount of self-initiated work performed by the pupils.

6. Inclusive behaviors of teachers are positively related to the amount of required work performed by the pupils, although this relationship is weaker than that of inclusive behaviors to self-initiated work.


The “Perception” Analysis


The central concern of this research is the investigation of the relationships of three measures of teacher behaviors to two dependent variables: the measures of required and self-initiated work performed by pupils. The behaviors of the teachers are classified into three categories called inclusive, preclusive, and conjunctive.

The data have been analyzed from two different points of departure. The first of these is termed the perception analysis; the second, the trait analysis. As the name of the former
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i i his prin- 
k scores of the pupil. This approac j 1 
dinalis concerned with the individual Pupil’s in- 


intra-group r's (keeping in mind the ar 
definition of th) 


bitrary 
is Tesearch, in Whic group de- 
notes all the pupils reporting on a Single teacher) 
The basic data here are the raw s 

Pup: 


analysis, Teported els e Wher e, 
deals With the ay mea 


lth cor 
ted work Scor 
thus furnishing a pi 


. few id 
d questionnaire, s 
Standard Error of Measurement and Reliability of the Scales


each of the five Scales, 
Cient for each Scale is also 
assessment of a teacher bya i 


о 
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i š 5 nt is 
for the coefficient of а single assessme 


E esti- 

where subscripts w and t denote the variance estimate within-teachers and the total variance estimate, respectively. Table I presents the results of the computation.

Inspection of the observed reliability coefficients for the assessments of a single pupil shows all five coefficients to be quite small, indicating that the individual pupils' ratings differ for different teachers.
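The printed formula for the single-assessment coefficient is not legible in this copy. The sketch below therefore assumes one common form that uses exactly the two quantities named in the text, r = 1 - s_w^2 / s_t^2, where s_w^2 is the within-teachers variance estimate and s_t^2 the total variance estimate. The function name and the example values are hypothetical and are not taken from the study.

```python
def single_assessment_reliability(var_within: float, var_total: float) -> float:
    """Reliability of a single pupil's assessment.

    Assumed form (not confirmed by the source): r = 1 - s_w^2 / s_t^2,
    with s_w^2 the within-teachers variance estimate and s_t^2 the
    total variance estimate.
    """
    if var_total <= 0:
        raise ValueError("total variance estimate must be positive")
    return 1.0 - var_within / var_total


# Hypothetical values: most of the total variance lies within teachers,
# so the coefficient for a single assessment is small.
print(single_assessment_reliability(75.0, 100.0))
```

Under this assumed form a small coefficient means that most of the variance in the ratings lies within teachers rather than between them.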


Correlational Analysis 
The next step in the statistical treatment of the data is an examination of the degree of relationship existing between the independent and dependent variables. The scores derived from the instruments are subjected to a correlational analysis.


Two tests of significance are applied to the coefficients of correlation: (1) the “sign” or “binomial series” test, and (2) the t-test of the significance of a product-moment coefficient of correlation as computed from a small random sample.


. | suita- 
The binomial series test is suitable in the present research because the exploratory nature of the study makes it seem desirable to establish the presence or absence of a trend in the relationships among variables without reference to the magnitude of the trend. The null hypothesis to be tested is that ρ = 0. If this is the case, the sample r values will differ from zero only by chance, and hence one would observe about as many positive as negative signs. The statistic to be observed is the number of positive (or negative) signs. The appropriate test of the significance of this statistic is to enter the table of cumulative binomial probabilities for p = .5 with the N of signs and the number of negative (or positive) signs and read directly the probability that such a distribution could be attained by chance.9 The .01 level of significance is adopted for the sign test.

For the product moment coefficients of correlation computed for the relationships among variables, the appropriate test of significance, as suggested by Lindquist,10 is the t-test. In the present application, the .05 level of significance is adopted.
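The table lookup described above can be sketched in a few lines. This is a minimal illustration and not the author's procedure: it computes the cumulative binomial probability for p = .5 directly instead of reading it from a printed table, and the function name is hypothetical.

```python
from math import comb


def sign_test_probability(n_signs: int, n_minority: int) -> float:
    """Cumulative binomial probability, with p = .5, of observing
    n_minority or fewer signs of the less frequent kind among n_signs.
    """
    if not 0 <= n_minority <= n_signs:
        raise ValueError("n_minority must lie between 0 and n_signs")
    favorable = sum(comb(n_signs, k) for k in range(n_minority + 1))
    return favorable / 2 ** n_signs


# Four coefficients, all with the same sign (no signs of the other kind).
print(round(sign_test_probability(4, 0), 4))
```

For four coefficients that all share one sign the function gives 1/16, or about .06, which agrees with the probability the text later reports for the four science groups.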


Through the analysis of the signs of the r's, the general direction of the relationship is determined. The t-test shows which r's are significant.

Zero-order coefficients of correlation were computed for each group for all combinations of independent with dependent variables. The intercorrelations among antecedents and among consequents were also computed. Each coefficient was computed twice, with additional cross-checks for accuracy. Raw-score data were utilized in the Pearson formula for the product moment correlation coefficient.
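As an illustration of the two computations named above, the following sketch computes a product-moment r from raw scores and the usual t statistic for testing it against zero, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The function names and the small data set are hypothetical, and the comparison with a tabled critical value is left to the reader.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation computed from raw scores."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r: float, n: int) -> float:
    """t statistic for the hypothesis that the population r is zero,
    with n - 2 degrees of freedom."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical scores for four pupils.
r = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
print(round(r, 2), round(t_for_r(r, 4), 3))
```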

The Preclusive Variable— The Coefficients of 
correlation of preclusive with required scores 


number of Significant r's, itis Concluded that 
the original hypothesis as to the relationship is 
by evidence of the “Pupil Survey", 
nd, the fact that thefour 
I Sare negative, This relationship 
should not letely discounted. 
that the preclusive 


Xtreme a manner, 


Table Il, 
i 15.5 posi- 


n pon his pupils (3) ability’ 

to develop interest in the classroom or er P 

ences, and (4) ability t. municate With his 
It was hypothesized th; j 


ott at the 
positively related to 
quents, especially to th 


€ amount o; uired 
work performed За 


е S of the Correlation. 
е summarized in Table 


The probability of this many positive signs is far less than .01, and the null hypothesis may be rejected with confidence. Sixteen of the r's of conjunctive with self-initiated scores are significant at or above the .05 level.

Comparable tabulations of the coefficients of conjunctive with required scores show only one negative sign (r = -.06). The null hypothesis as to sign may be rejected with confidence.
teen of the r's are Significant at or above the . 

1. 
- is possible to Say that the evid ence of pe 
two tests applied to the relation of the conjane E 
to the required and self-initiated scores E 
Strong indication that the behaviors measured ру 
the conjunctive items аге 
іп а таппег indicating a f 


is taken of the attenuating fac- 
measuring instrument and its 


ients are positive; one 
esis may safely be 


9f significance applied to the rIR'S 
ificant Coefficients among those for 
Table IV Summarizes the distri- 


hoo» Significant at or above the . 05 lev- 
el, by School and by Subject, 


The results in the science groups are especially noteworthy, since the coefficients of all these groups are significant. The sign test is not sensitive with so few cases, since the probability of getting four similar signs by chance is .06.
Another fact should be noted, however; the ze
TABLE II

COEFFICIENTS OF CORRELATION OF PRECLUSIVE SCORES WITH REQUIRED AND SELF-INITIATED SCORES, FOR EACH TEACHER


Correlation 
NET uie 


School Subject Teacher n PR PS 
I Eng. 3 46 13 .03 
I Eng. 4 60 -.07 E vi 
I Eng. 6 48 -,28* -.40** 
I Eng. 1 22 -.28 -.16 
I Eng. 8 55 .09 .03 
I Eng. 9 29 .07 .06 
I Arith. 1 80 -.06 BC 
I Arith. 2 19 .09 -.18* 
1 Arith. 5 61 -.14 -.38** 
I Arith. 10 33 .18 11 
II Eng. 13 112 -.05 -.01 
n Eng. 14 93 .16 .00 
Hn Arith. il 88 -.04 .00 
п Arith. 12 122 -.04 .07 
ш Eng. 16 24 -.31* EX 
ш Eng. 19 72 2:19 .06 
ш Arith. 15 13 .34 .54* 
IH Arith. 17 83 12 .04 
HI Sci. 18 111 -.09 E 
IV Eng. 21 20 „28 a 
IV Eng. 22 21 т " "00 
IV Eng. 26 4 "24 "20 
IV Eng. 21 21 " * 06 
IV Arith. 20 21 zi š 
IV Arith. 24 54 =O" b 
S = н m m 
ts De A 15 142 -.51* 
v - 8: 33 19 -.46* -.21 
v "xd 23 49 .05 -.08 
V arith 32 E. E в 

* 99 54 -.06 
V 54 ED -.30* 


umb f Significant r's А 
ri L^ with Same Sign 


N 
Number of Significant Z s 


* Significant at .05 level.
** Significant at .01 level.
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Correlation 
School Subject Teacher n E MEME INE a s 
* 
I Eng. , 3 46 .21 19 .14 aoe 
I Eng. 4 60 .35** 30* 25* Э ** 
1 Eng. 6 48 .51** 51** .44** oor 
I Eng, 7 22 .35 36* .58** UN 
I Eng. 8 55 .32* .28* .29* 23 
I Eng. 9 29 .26 .23 .00 gota 
I Arith, 1 80 .19* 08 .14 pos 
I Arith, 2 79 -.06 .08 17 Е әй 
I Arith, 5 6l .1€ 21* 28* „al 
I Arith. 10 33 .26 07 38* m 
H Eng. 13 112 .21* 11 18* тө, 
П Eng. 14 93 .11 12 09 o 
H Arith 11 88 .16 05 28** i 26" 
z Arith 12 122 27 23** 324% pon 
E Eng. 16 24 25 31 65** Lese 
ш Eng. 19 72 .28** 20* 36** 42 
ш Arith 15 13 EU -.17 -.01 «I 
m Arith 17 83 <04 10 . 26%% po^ 
тү m. 18 111 E 33** .46** e 
v Eng. 21 20 18 32 18 pis 
Eng. 22 & . 59** 
IV Eng. n 21 .37* 45* „44% Ë 
IV Ene, a8 41 01 06 .17 E 
IV Arith, 626 21 14 -.04 .35 ee 
IV Arith 2i 21 .52** .49* 17 !gg** 
IV Sci. 25 94 .32** .28* 20 110** 
у Епр. 28 96 .18* .16 33** А 03 
Eng 31 E .02 . 09 -.02 n 
у Eng. 33 .59** 53* 50* A 
M Eng. 23 19 32 ET 50* ek 
V Arith 32 2 .32* .39** 28* - 
V Sci, 29 d .25* .21* 27** 30 
V Sci. 30 4 .27* 


Number of Significant r's 


* Significant at the .05 level. 
** Significant at the .01 level.
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TABLE IV 
DISTRIBUTION OF COEFFICIENTS OF CORRELATION OF INCLUSIVE WITH REQUIRED 
SCORES, BY SCHOOLS AND BY SUBJECTS 


Arithmetic Science 
Mee === — t 


English 
No. of Total Significant Total Significant Total Significant 
School Groups Groups r Groups r's Groups rs š 
i 10 6 4 4 2 
4 2 1 2 2 
2 1 1 1 


I 
HI 
IV 
V 


Total 33 


TABLE V

DISTRIBUTION OF COEFFICIENTS OF CORRELATION OF INCLUSIVE WITH SELF-INITIATED SCORES, BY SCHOOLS AND BY SUBJECTS


English Arithmetic Science 
„== —====== 

ignificant Total Significant Total Significant 

No. of Sua Groups т?з Groups r's 


School Groups 


Total 33 18 
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Table V). It is interesting to Specu- 
a Dm on the possibility that subject 
differences are a reality, and that the relation- 
Ships among the var 
Science, but not for E 
these latter Subjects, Я 
tively, of ће г R'S аге Significant, wi 
cent of the ryg’s Significant in both 
arithmetic. 


Some of the questions that Come to mind are: 
(1) Is the j 


Í the pupils 
© experiences of Science 
classes that the major Crystalizing agency be- 
comes the behaviors of the science teacher? 

At the least, the phenomenon of significance in the case of all the science groups suggests interesting possibilities as to subject differences that might provide a fruitful field for research.


In the original statement of the 
this Study, th 


null hypo 
cance is beyond .01. 
The relationship b 

i d 


total sample 
in each 


ious as a result of throwi; 
ing unlike means, 13 


A solution to this diffic
ments. Three correlation coefficients may be derived from these data: (1) one based on the total sums-of-squares, (2) one based on the within-sums, and (3) one based on the between-sums. The between-groups r is actually the correlation between the x [I] means and the y [R or S] means for the groups . . . [and] an appreciable between-groups r indicates that the total r is spurious; this spuriousness is eliminated when r is computed from the within sums.”14

In order to avoid such spuriousness, the desired coefficients of correlation for the total sample and for each community have been computed from the within-sums. The total r's, the between-groups r's, and the within-groups r's are presented in Table VI.
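The three coefficients distinguished above can be illustrated with a short sketch. It assumes the standard partition of sums of squares and cross-products into within-groups and between-groups parts, with group means weighted by group size; the function name and the data are hypothetical and are chosen so that pooling groups with unlike means produces a misleading total r.

```python
from math import sqrt


def decomposed_correlations(groups):
    """Total, between-groups and within-groups r.

    `groups` is a list of (x_scores, y_scores) pairs, one pair per group.
    """
    all_x = [v for xs, _ in groups for v in xs]
    all_y = [v for _, ys in groups for v in ys]
    n = len(all_x)
    grand_x = sum(all_x) / n
    grand_y = sum(all_y) / n

    # Sums of squares and cross-products: [Sxy, Sxx, Syy].
    within = [0.0, 0.0, 0.0]
    between = [0.0, 0.0, 0.0]
    for xs, ys in groups:
        m = len(xs)
        mean_x = sum(xs) / m
        mean_y = sum(ys) / m
        for a, b in zip(xs, ys):
            within[0] += (a - mean_x) * (b - mean_y)
            within[1] += (a - mean_x) ** 2
            within[2] += (b - mean_y) ** 2
        between[0] += m * (mean_x - grand_x) * (mean_y - grand_y)
        between[1] += m * (mean_x - grand_x) ** 2
        between[2] += m * (mean_y - grand_y) ** 2
    total = [w + b for w, b in zip(within, between)]

    def r(sums):
        return sums[0] / sqrt(sums[1] * sums[2])

    return {"total": r(total), "between": r(between), "within": r(within)}


# Two hypothetical groups with very different means: inside each group
# x and y move in opposite directions, yet the pooled r is strongly positive.
result = decomposed_correlations([([0, 1], [1, 0]), ([10, 11], [11, 10])])
print({name: round(value, 3) for name, value in result.items()})
```

In this constructed case the within-groups r is -1 while the total r is close to +1, which is the kind of spuriousness the within-sums computation is meant to remove.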

The data of Table 
tion coefficient exce 
Community B is sig 
Single exception is s 


; a multiple regression е qu ahpn 
was considere The variables pus 
hly intercorrelated, 15 The 33 ps u 

C = .44 to гус = .98. 
Orrelation 15 TIC = .67, and all the 
2’S are Signifi 5 level. 


: Y which pupils could diffe теп" 
tiate between those teachers who were “conjunctive- 
and those who were “‘disjunctive- 
› if indeed such differences exist at 
There is à further possibility that the pupils 
Sir teachers in Such a unitary 
manner that the “halo effect” of the Strong, over- 
akes differentiation impos Si- 
le. As defined by ti
TABLE VI

CORRELATION COEFFICIENTS OF INCLUSIVENESS WITH REQUIRED WORK SCORES AND SELF-INITIATED WORK SCORES, FOR THE TOTAL SAMPLE AND FOR EACH COMMUNITY


Total Sample 


Total r ° 
Among Groups r 31 .63 
Within Groups r 1152 .28 <.01 1752 35 
Community А 
Total ^x 926 .36 «.01 926 38 2 P 
Among Groups r 12 .64 «.01 р а PE 
Within Groups x 913 .22 «.01 1 Ў 2 
Communi 
d É 856 491 <. di Е а ER o: 
Among Groups r a „51 2 938 .39 «.01 


Within Groups г 


TABLE VII

SCHOOL MEANS OF ALL ANTECEDENT AND CONSEQUENT SCORES


новава Vert a Consequent Variables 
Р e R s 


School 


š 515 88. 51 45.20 
43 62.01 102.71 43.38 "T 
Ë 416 лї. 
42.26 19.34 
ш 05 10.25 59. 03 98.89 
3 k 
50. 33 108. 96 45. 98 25.35 
E i 
> ü i 104. 60 41.53 — 
bi 314 q1.41 


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this phenomenon. 
As far as the findings of the preceding analysis are concerned, the inclusive and conjunctive scores of a teacher may have some validity as indices of the teacher's ability to motivate his pupils, if the criteria of this ability are stated in terms of the pupils' perceptions of the amount of required and self-initiated work performed.

There may be some interest attached to a subsidiary treatment of the scores derived from the “Pupil Survey”. After the school means are rank-ordered (the preclusive with negative orientation), they provide the data of Table VIII.


With all the limitations of Such rank orders 
firmly in mind, it is 


Still a matter of interest to 
note that the School a 


It seems fair to say, at least, that the rank-ordered means of schools lend support to the idea that there is some intrinsic meaning in the variables of this study.


itive r's, three are significant. The sign tes " 
gives an indication of a trend, with some Y £s 
ing evidence from the appearance of three signi P 
cant r's. However, in view of the small numbers 
of cases and of the appearance of three sizable 


Ors of their teach- 
О be significantly related to 
the perceptions of the pupils. 
Schools I and V sho 
toward agreement wi 
principal of school 
pupils' reports, 


“The Teacher's Estimate of the Pupils' Work”


Each teacher was as| 
pils on the amount of r 


Pupil himself, Coeffi- 
re computed for the con- 
The results are presented in 


In three of 29 groups, the sign TRtRp is nega- 
tive. One negative sign appears in the array of 
TStSp: All four negative coefficients are non-sig- 


y his pupils has a p 
to the pupils? 


Some reliance 


upon the repo 
their own perfi 


ormance of sc 


rating of the teachers’ be- 
Consistently related to the 

Pupils’ rating of the teachers. 
estimates of their pupils re- 
quired ang Selt-initiateg Work are signifi- 


TABLE VIII

RANK ORDER OF SCHOOL MEANS OF ALL ANTECEDENT AND CONSEQUENT SCORES

                  Antecedent Variables    Consequent Variables
School     n        I     P     C             R     S
I         515       1     1     2             1     1
II        416       4     5     4             3     4
III       305       5     4     5             4     5
IV        274       2     2     1             2     2
V         314       3     3     3             5     3

TABLE IX

COEFFICIENTS OF CORRELATION OF THE PRINCIPAL'S RATINGS WITH PUPILS' RATINGS ON THE INCLUSIVE, PRECLUSIVE, AND CONJUNCTIVE VARIABLES, FOR 28 TEACHERS


a a 
" Ip ig 


No. of š = 
School Teachers TJ, I2 Tp, P2 
AUTE Pa .40 


I 10 
5 -.49 .01 -.54 
м 48 -.29 
1 .33 . 
» 36 , 83* 
6 mu! . 
V 
28 


Total 
a Subscript 1 indicates principals' ratings; subscript 2 indicates pupils' ratings.
* Significant at the .05 level.

TABLE X

COEFFICIENTS OF CORRELATION OF TEACHER'S ESTIMATE OF PUPILS' REQUIRED AND SELF-INITIATED WORK WITH PUPILS' SCORES ON THESE VARIABLES

                              No. of
School  Subject  Teacher      Pupils      rRtRp (a)    rStSp (a)
I       Eng.        3           48          .41*         .46*
I       Eng.        4           53          .27*         .58*
I       Eng.        6           48          .23          .30*
I       Eng.        7           22          .27          [illegible]
I       Eng.        8           55          .14          .48*
I       Arith.      2           80          .23*         .23*
I       Arith.      5           62          .17          .22*
I       Arith.     10           33          .26          .20
II      Eng.       13          110          .14          .13
II      Eng.       14           93          .09          .15
II      Arith.     11           88          .05          [illegible]
II      Arith.     12          125          .15*         .00
III     Eng.       16           24          .22          .21
III     Eng.       19       [illegible]     .82*         .40*
III     Arith.     15           13         -.22          .11
III     Arith.     17           75          .23*         .10
III     Sci.       18          112         -.01          .68*
IV      Eng.       21           20          .11          .42*
IV      Eng.       22           18          .43*         .57*
IV      Eng.       26            4         -.16          .00
IV      Eng.       27           21          .52*         .21
IV      Arith.     20           21          .40*         .38*
IV      Arith.     24           53          .25*         .22
IV      Sci.       25           98          .16          .48*
V       Eng.       28           18          .60*         .42*
V       Eng.       31           14          .71*         .64*
V       Eng.       23           42         -.05         -.02
V       Arith.     32           69          .56*         .94*
V       Sci.       30           53          .27*         .95*

Number of significant r's                   14           16

a Subscripts t and p indicate teachers' and pupils' estimates, respectively.
* Significant at .05 level.

cantly related to the pupils' own estimate of their work.

4. The relationship of the preclusive behaviors to the work scores is not clear, the evidence being inconclusive.

5. Strong evidence is adduced to show that, in the perception of the pupils, scores on inclusive and conjunctive behavior of teachers are related to scores on the performance of required and self-initiated work of pupils.


Some Implications of the Research 


The results observed in this study may be of interest to researchers in education, to persons involved in classroom practice, and to psychologists.

For educational researchers there may be some value in the stability and discriminative power of the pupils' scores on items of teacher behavior. These items seem to have the virtues of specificity and clear definition. They may prove useful in studies of classroom competence. It is also perhaps worth noting that the concept of amounts of pupil work as variables intervening closely before pupil change may be productive, if there is a basis for confidence in the pupils' reports of their own work.
There is some poss the findings of 
imited applicabil- 


ity in the classroom, Since most educators are 


i C Nay: Furthermore, the 
and between the inclusi 
b the self-initiated wo 
€ of some moment to educat 
reliance upon theories of education in 
Pupil’s interest, his self-reliance, i 
and his self-initiated activities play 
a role, 
Finally, at least two additional aspects of this research may hold some interest for psychologists. The first is that evidence is found that would tend to reinforce Miller and Dollard's hypotheses relating to the gradients of approach and avoidance. The second is that the research may offer some clues for those concerned with the relationship of personality variables to overt classroom behaviors of teachers.


FOOTNOTES 
* 
This article is based on a doctorate dissertation prepared at the Harvard Graduate School of Education. A detailed presentation of the theoretical structure upon which the present analysis is based is available in an article entitled “Theory and Design of a Study of Teacher-Pupil Interaction,” appearing in Harvard Educational Review, XXVI (Fall 1956), pp. 315-42.

1. N. E. Miller and J. Dollard. Social Learning and Imitation (New Haven: Yale University Press, 1941).

2. H. A. Murray. Explorations in Personality (New York: Oxford University Press, 1941).

3. Murray, op. cit.

4. K. Lewin and others. “Patterns of Aggressive Behavior in Experimentally Created ‘Social Climates’,” Journal of Social Psychology, X (May 1939).

5. H. H. Anderson and others. “Studies of Teachers' Classroom Personalities, III,” Applied Psychology Monographs, No. 11 (Stanford: Stanford University Press, 1946).

6. R. B. Cattell. Description and Measurement of Personality (New York: World Book Co., 1946).

7. The description of the methods and findings of the trait analysis is presented in the second article, entitled “The Relation of the Behavior of Teachers to the Productive Behavior of Their Pupils: II. ‘Trait’ Analysis,” in this Journal.

8. [beginning illegible] ... and are presented in Chapter V of the original doctorate dissertation at the Harvard Graduate School of Education.

9. P. O. Johnson. Statistical Methods in Research (New York: Prentice-Hall, 1949), pp. 51-8.

10. E. F. Lindquist. Statistical Analysis in Educational Research (Boston: Houghton Mifflin Co., 1940), pp. 210-11.

These tables may be found in Appendix D of the original thesis, pp. 171-74.

13. As presented by Q. McNemar, Psychological Statistics (New York: John Wiley and Sons, 1949), p. 96.

14. Ibid., p. 322.

15. For all the intercorrelations not reported in this article, tables are available in Appendix D, pp. 171-74, of the author's original dissertation.

THE BEHAVIOR OF TEACHERS AND THE PRODUCTIVE BEHAVIOR OF THEIR PUPILS
II. “TRAIT” ANALYSIS*


MORRIS L. COGAN 


Harvard University 


ATTEMPTS TO appraise the competence of teachers have commonly utilized one of two criterion measures. The first of these is frequently taken in terms of the opinions of principals, supervisors, and experts as to the effectiveness of the teacher. The second is, commonly, some estimate of pupil change, or growth, or achievement. Both paradigms seem to have serious limitations in their applicability to research purposes. The opinions of persons considered competent to make judgments of the effectiveness of a teacher achieve acceptable reliability only with great difficulty; they are also too far removed from the criterion of pupil change. On the other hand, the attempt to measure pupil change also presents certain stubborn difficulties: instruments are lacking to measure many of the kinds of pupil changes; it is, in addition, extremely difficult to isolate the influences of specific teachers upon pupils in programs of departmentalized instruction.
The present study seeks an answer to some of these methodological problems. The independent measures are taken in terms of specific, observable teacher behaviors, avoiding recourse to opinion; the dependent measures are taken in terms of two measures of pupil work which are considered to intervene just prior to pupil change, and which can be more satisfactorily measured than pupil change.

The independent variables of teacher behavior are called inclusive, preclusive and conjunctive. The inclusive behaviors tend to make the pupils central to the teacher's classroom decisions and to the teaching-learning experiences; the pupils feel that their goals, their abilities, and their needs are taken into account; the teachers exhibit behaviors that may be termed integrative, affiliative, nurturant. Preclusive behaviors of teachers tend to make the pupils peripheral to classroom decisions and experiences; the pupils feel that their needs, their goals, their abilities are frequently overridden by other considerations; the teachers exhibit behaviors that may be termed dominative, aggressive, rejectant. Conjunctive behaviors are those behaviors of the teacher which give evidence of (1) his skill in classroom management, (2) his ability to communicate with the pupils, (3) his command of and ingenuity in dealing with the subject matter, and (4) the level of his demands upon the pupils.


The dependent variables are two measures of the pupils' productivity: (1) the amount of required work performed by the pupils, and (2) the amount of class-related self-initiated work performed by the pupils.
i on data collected 


The study is based on data collected from the pupils themselves by means of a questionnaire called the “Pupil Survey”. Scores for each of the three teacher variables are derived from pupil ratings of the frequency with which a specified teacher performs certain actions. The productivity scores are secured by providing scales for the pupils on which they report the frequency with which they perform certain common required assignments and engage in various self-initiated activities in connection with the work of a specified classroom. Each pupil is asked to report on two of three teachers (in English, arithmetic, and science). The pupils are the source of the data of this research because they are in an excellent position to observe their own work and the behavior of their teachers.

* This is the third article in a series. The first is entitled “Theory and Design of a Study of Teacher-Pupil Interaction,” in Harvard Educational Review, XXVI (Fall 1956), pp. 315-42; the second is entitled “The Behavior of Teachers and the Productive Behavior of Their Pupils: I. ‘Perception’ Analysis,” which is published in this issue of the Journal of Experimental Education.


Some representative hypotheses state that the teachers' inclusiveness (I) scores will be positively related to the required work (R) scores and to the self-initiated work (S) scores. A similar hypothesis is offered for the conjunctive (C) variable. The preclusive (P) behaviors of the teachers are hypothesized to be negatively related to the required and self-initiated work scores of the pupils.

The sample includes 33 teachers, five principals, and 987 eighth-grade pupils in five departmentalized junior high schools. From these pupils 1786 usable surveys were obtained. The schools are located in two urban communities (“A” and “B”) having appreciably different socio-economic characteristics. The population of Community A has a rather high average income and educational level. A relatively large proportion of the population is employed in professional or managerial positions. The persons


relatively fewer 


Analysis of the Data 


This paper is concerned with the “trait” analysis of the data.


tively as a sample of teachers. The basic data 
are the average scores of a group, “group” being defined as all the pupils in the sample who have reported on their work with a specified teacher. Most of the pupils were asked to complete a survey for two of their teachers.

Analysis of Variance 
At the outset, it is useful to determine the answers to certain questions concerning the five variables:


meaningful classification? 

2. Are there school differences in the scores on the variables, or may the schools be considered as random samples from a single population of schools?

3. Does the community constitute a meaningful unit of analysis, or may the communities be considered as random samples of communities from a single population?

4. Do the means of the criterion variables serve to differentiate among teachers of the same subject?

These questions and others of related character are examined by means of variance analysis. Since the numbers of cases are not equal in different categories of classification, only the simplest analysis of variance model is used. Certain important questions are left to be answered later by the covariance adjustment.

For each of the following computations the number of degrees of freedom for the sum-of-squares among the units of the classification is one less than the number of units. For the sum-of-squares within units of the classification, the number of degrees of freedom is equal to the number of pupils minus the number of units of classification.

The first step in the variance analysis is to examine the hypotheses that the means of the scores of the groups are equal for each of the variables. The F-test is applied to the scores of each group, and the null hypothesis is:

$$H_0:\ \mu_1 = \mu_2 = \cdots = \mu_{33}$$

where μ is the mean score of a group on a given variable for the teacher indicated by the corresponding subscript. The results of the computations are presented in Table I through Table V.
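The simple analysis of variance described above can be sketched as follows. This is a minimal illustration of the one-way model with unequal group sizes, not a reproduction of the author's computations; the function name and the three small groups are hypothetical.

```python
def one_way_anova(groups):
    """One-way analysis of variance for groups of unequal size.

    Returns the among-groups and within-groups sums of squares,
    their degrees of freedom, and the F-ratio.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n

    ss_among = 0.0
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss_among += len(g) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in g)

    df_among = k - 1   # one less than the number of units
    df_within = n - k  # number of scores minus the number of units
    f_ratio = (ss_among / df_among) / (ss_within / df_within)
    return {
        "ss_among": ss_among,
        "ss_within": ss_within,
        "df_among": df_among,
        "df_within": df_within,
        "F": f_ratio,
    }


# Three hypothetical groups of scores.
print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

With 33 groups and 1786 surveys the same rules give 32 and 1753 degrees of freedom, and the inclusive-score entries of Table I are consistent with this arithmetic: 6,190.79 / 234.41 is approximately 26.41.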

It is to be noted that there was a large and often complete overlap of membership in the 33 groups. No allowance could be made for this in the computation of the F-ratios over groups. The significance obtained would be an under-estimation if pupils scoring high on their own work or on their teacher's behaviors in one subject also tended to score high on these variables in another subject. (It may be recalled that most pupils completed two “Surveys”.) If, on the other hand, the homogeneity thus hypothesized did not occur, and high scores in one subject were accompanied by low scores in the other, as if by some “law of compensation”, then the level of significance would be an over-estimation. The occurrence of this latter phenomenon is considered highly unlikely, especially in view of the results of the analysis of the differences among the groups within each subject area, where the groups are non-overlapping.

TABLE I

ANALYSIS OF VARIANCE OF GROUP MEANS OF INCLUSIVE SCORES

Source of                  Sum of
Variation         df       Squares        Variance      F       P
Among Groups      32     198,105.28      6,190.79     26.41   <.01
Within Groups   1753     410,926.13        234.41
Total           1785     609,031.41

TABLE II

ANALYSIS OF VARIANCE OF GROUP MEANS OF PRECLUSIVE SCORES

Source of                  Sum of
Variation         df       Squares        Variance      F       P
Among Groups      32     208,712.98      6,522.28     22.84   <.01
Within Groups   1753     500,569.38        285.55
Total           1785     709,282.36

TABLE III

ANALYSIS OF VARIANCE OF GROUP MEANS OF CONJUNCTIVE SCORES

Source of                  Sum of
Variation         df       Squares        Variance      F       P
Among Groups      32     114,846.48      3,588.95     15.84   <.01
Within Groups   1753     397,231.90        226.60
Total           1785     512,078.38

TABLE IV

ANALYSIS OF VARIANCE OF GROUP MEANS OF REQUIRED WORK

Source of                  Sum of
Variation         df       Squares        Variance      F       P
Among Groups      32     141,363.92      4,417.62     17.76   <.01
Within Groups   1753     435,887.49        248.65
Total           1785     577,251.41

TABLE V

ANALYSIS OF VARIANCE OF GROUP MEANS OF SELF-INITIATED WORK

Source of                  Sum of
Variation         df       Squares        Variance      F       P
Among Groups      32      68,535.86      2,141.75      9.40   <.01
Within Groups   1753     399,470.86        227.88
Total           1785     468,006.72

The hypothesis that the means of the scores on each of the five variables for all groups are equal is rejected. The scores derived from the surveys discriminate among teachers. This is tantamount to saying that the pupils report differing levels of the three kinds of teacher behavior and of required and self-initiated work for different teachers.

TABLE VI

ANALYSIS OF VARIANCE OF SCHOOL MEANS OF INCLUSIVE SCORES

Source of                  Sum of
Variation         df       Squares        Variance      F       P
Among Schools      4      93,982.66     23,495.66     81.25   <.01
Within Schools  1781     515,048.77        289.19
Total           1785     609,031.43

The next phase of the analysis is an examination of the hypotheses that there are no differences among the school means of pupils on the variables. The results of this analysis are summarized in Tables VI through X.

All the F-ratios are significant, and the null hypotheses regarding school means are rejected (P < .01). In terms of the simple analysis of variance, the schools vary significantly, and they cannot be considered as random samples from the same population of schools.

(Perhaps an anticipatory note is indicated at this point. Until an analysis of the hierarchical relations existing among the three levels of classification, namely teachers, schools, and communities, is made, it should be kept in mind that significant differences observed at any one level may merely be reflections of differences found at another level.)
The question as to whether significant differences
exist between the means of the variables
in the two communities is examined next. With
the possibility in mind that the socioeconomic
level of the community might prove to be a significant
factor in the measurement of the teacher
variables and pupils' productivity, an effort was
made to maximize socioeconomic differences in
the selection of communities. The results are
summarized in Table XI through Table XV.

The communities vary significantly at the .01
level in scores on the inclusive and required
work variables. The differences between communities
on the preclusive and conjunctive variables
are significant at the .05 level, and the
differences on the self-initiated work variable
are no longer significant. Since the .01 level was
adopted as significant prior to the analysis, it may
be said that acceptable significant differences
between communities are limited to two variables.
At the community level in the hierarchy of
classification of this study, there is evidence
that the distinctions, apparently quite sharp at the
group and school levels, are becoming less acute,
and strong evidence presented later tends to
corroborate this.

No conclusions are drawn at present as to the
relationship between the socioeconomic factor
and the community differences.

Throughout the analyses thus far the magnitude
of the F-ratio has consistently been smaller for
self-initiated work than for required work,
although the ratio becomes non-significant
only at the community level. There would seem
to be some evidence to indicate that the amount of
self-initiated work varies less within the units of
classification than does required work. Of the
independent variables the sharpest variances are to
be found in the inclusive scores, descending
through the preclusive and conjunctive scores in
order.

Finally, the analysis of variance is computed
for each subject area. The question to be answered
is whether there still exist differences
among the several group means of the variables
within a given sub-class of teachers, i.e., those
instructing in English, arithmetic, or science.

Two special conditions applying to these sub-classes
deserve mention. First, the subject classifications
cut across all schools and communities,
with the exception of science (which is limited to
[...] within a single community). The
[...] constitute a fourth [...] hierarchy already
described. Second, since the memberships of the
groups within each subject class do not overlap,
the qualifications as to overlapping memberships
that were appended to the analysis [...]
groups do not apply here. There is, then, no
reason for taking into account the possible effects
of the shared membership [...] significance
of observed differences within each subject area.
Table XVI through Table XX summarize the
results of the variance analysis.

All the F-ratios for group means of the scores
within each subject area are significant at the .01
level.

In terms of the responses of pupils on the "Pupil
Survey", these conclusions seem warranted:


1. The simple analysis of variance without covariance
adjustment and without reference
to the impact of levels of the hierarchy upon
each other shows significant differences
in the means of all the variables for all
groups together, for all groups in each subject
area, and for all schools.

2. In the analysis of variance between communities,
only the inclusive and required work
scores achieve significance at the .01
level.

3. The teachers appear to be significantly different
with respect to traits of inclusiveness,
preclusiveness, and conjunctivity
when characterization of the teachers is in
terms of the pupils' reports.


4. The pupils report differing amounts of required
and self-initiated work for different
teachers.


TABLE VII

ANALYSIS OF VARIANCE OF SCHOOL MEANS OF PRECLUSIVE SCORES

Source of                 Sum of
Variation          df     Squares       Variance      F        P
Among Schools       4    78,307.53    19,576.88     55.26    <.01
Within Schools   1781   630,974.83       354.28
Total            1785   709,282.36
TABLE VIII

ANALYSIS OF VARIANCE OF SCHOOL MEANS OF CONJUNCTIVE SCORES

Source of                 Sum of
Variation          df     Squares       Variance      F        P
Among Schools       4    23,153.76     5,788.44     21.09    <.01
Within Schools   1781   488,924.62       274.52
Total            1785   512,078.38
TABLE IX

ANALYSIS OF VARIANCE OF SCHOOL MEANS OF REQUIRED WORK

Source of                 Sum of
Variation          df     Squares       Variance      F        P
Among Schools       4    31,877.13     7,969.28     26.02    <.01
Within Schools   1781   545,372.81       306.22
Total            1785   577,249.94


TABLE X

ANALYSIS OF VARIANCE OF SCHOOL MEANS OF SELF-INITIATED WORK

Source of                 Sum of
Variation          df     Squares       Variance      F        P
Among Schools       4    17,803.43     4,450.86     17.61    <.01
Within Schools   1781   450,140.29       252.75
Total            1785   467,943.72
TABLE XI

ANALYSIS OF VARIANCE OF COMMUNITY MEANS OF INCLUSIVE SCORES

Source of                        Sum of
Variation                 df     Squares       Variance      F        P
Between Communities        1    13,480.86    13,480.86     40.38    <.01
Within Communities      1784   595,549.57       333.83
Total                   1785   609,031.43
TABLE XII

ANALYSIS OF VARIANCE OF COMMUNITY MEANS OF PRECLUSIVE SCORES

Source of                        Sum of
Variation                 df     Squares       Variance      F        P
Between Communities        1     1,642.11     1,642.11      4.14    <.05
Within Communities      1784   707,640.25       396.66
Total                   1785   709,282.36


TABLE XIII

ANALYSIS OF VARIANCE OF COMMUNITY MEANS OF CONJUNCTIVE SCORES

Source of                        Sum of
Variation                 df     Squares       Variance      F        P
Between Communities        1     1,379.46     1,379.46      4.82    <.05
Within Communities      1784   510,698.92       286.27
Total                   1785   512,078.38
TABLE XIV

ANALYSIS OF VARIANCE OF COMMUNITY MEANS OF REQUIRED WORK

Source of                        Sum of
Variation                 df     Squares       Variance      F        P
Between Communities        1    10,879.76    10,879.76     34.27    <.01
Within Communities      1784   566,370.18       317.47
Total                   1785   577,249.94

TABLE XV

ANALYSIS OF VARIANCE OF COMMUNITY MEANS OF SELF-INITIATED WORK

Source of                        Sum of
Variation                 df     Squares       Variance      F        P
Between Communities        1       659.88       659.88      2.52    >.05
Within Communities      1784   467,283.84       261.93
Total                   1785   467,943.72


TABLE XVI

ANALYSIS OF VARIANCE OF GROUP MEANS OF INCLUSIVE SCORES
WITHIN EACH SUBJECT AREA

Source of                 Sum of
Variation          df     Squares       Variance      F        P
English
  Among Groups     17    91,438.51     5,378.74      9.78    <.01
  Within Groups   747   410,926.13       550.10
  Total           764   502,364.64
Arithmetic
  Among Groups     10    79,888.56     7,988.86     36.33    <.01
  Within Groups   695   152,823.18       219.89
  Total           705   232,711.74
Science
  Among Groups      3    13,394.05     4,464.68     16.19    <.01
  Within Groups   311    85,781.14       275.82
  Total           314    99,175.19

TABLE XVII

ANALYSIS OF VARIANCE OF GROUP MEANS OF PRECLUSIVE SCORES
WITHIN EACH SUBJECT AREA

Source of                 Sum of
Variation          df     Squares        Variance      F        P
English
  Among Groups     17    86,345.88      5,079.17     22.65    <.01
  Within Groups   747   167,480.66        224.20
  Total           764   253,826.54
Arithmetic
  Among Groups     10    88,120.85      8,812.09     26.68    <.01
  Within Groups   695   229,551.[...]     330.29
  Total           705   307,672.[...]
Science
  Among Groups      3   [illegible]     8,950.27     26.88    <.01
  Within Groups   311   [illegible]       332.92
  Total           314   130,[...]


TABLE XVIII

ANALYSIS OF VARIANCE OF GROUP MEANS OF CONJUNCTIVE SCORES
WITHIN EACH SUBJECT AREA

Source of                 Sum of
Variation          df     Squares       Variance      F        P
English
  Among Groups     17    72,989.31     4,293.49     18.95    <.01
  Within Groups   747   169,287.94       226.62
  Total           764   242,277.25
Arithmetic
  Among Groups     10    25,050.65     2,505.06     12.35    <.01
  Within Groups   695   140,972.72       202.84
  Total           705   166,023.37
Science
  Among Groups      3    13,347.42     4,449.14     15.91    <.01
  Within Groups   311    86,971.24       279.65
  Total           314   100,318.66

TABLE XIX

ANALYSIS OF VARIANCE OF GROUP MEANS OF REQUIRED WORK
WITHIN EACH SUBJECT AREA

Source of                 Sum of
Variation          df     Squares       Variance      F        P
English
  Among Groups     17    45,278.69     2,663.45      9.21    <.01
  Within Groups   747   215,921.08       289.05
  Total           764   261,199.77
Arithmetic
  Among Groups     10    52,252.71     5,225.27     31.14    <.01
  Within Groups   695   116,623.10       167.80
  Total           705   168,875.81
Science
  Among Groups      3    14,386.37     4,795.46     14.43    <.01
  Within Groups   311   103,343.31       332.29
  Total           314   117,729.68


TABLE XX

ANALYSIS OF VARIANCE OF GROUP MEANS OF SELF-INITIATED WORK
WITHIN EACH SUBJECT AREA

Source of                 Sum of
Variation          df     Squares       Variance      F        P
English
  Among Groups     17    28,382.69     1,669.57      6.21    <.01
  Within Groups   747   198,283.12       265.44
  Total           764   226,665.81
Arithmetic
  Among Groups     10    12,371.79     1,237.18      7.95    <.01
  Within Groups   695   108,132.34       155.59
  Total           705   120,504.13
Scien 
тнай? Groups ° E, m 1р 6 is | е = 
Within Groups 311 , 992. ° 
ithin Groups ^m 96, 930. 16 


Total 


[Table of unadjusted and adjusted F values for teachers in each
community and for both communities; entries not recoverable.]
*All significant at .01 level.


Covariance Adjustment

As will be seen in a later subsection of this
article, there is a positive correlation between
each of the criterion variables and the inclusive
[...] uncontrolled differences.

When the covariance adjustment is made for
[...] presented in Table XXI.

The following observations may be made concerning
the data in Table XXI. The results are
not unanticipated. The theory of this research
does not hypothesize that differences in the group
[...]

[...] analysis of these data, a problem
arises as to the treatment of observations
belonging to teachers who belong to schools which
in turn belong to a community. Professor John
B. Carroll of the Harvard Graduate School of
Education has suggested what may be a new departure
in the treatment of such problems. His
description of the statistical method follows:*

"There is a possibility that a new application
of analysis of variance may be of some utility in
dealing with hierarchical classifications. Suppose
one has N observations which belong to classes
$A_1, A_2, \ldots, A_a$ which in turn belong to classes
$B_1, B_2, \ldots, B_b$ $(b < a)$, which in turn belong to
classes $C_1, C_2, \ldots, C_c$ $(c < b)$, etc. (It would
be preferable that these inequalities should be
$b < \tfrac{1}{2}a$, $c < \tfrac{1}{2}b$, etc., in order to insure that
each class have at least two sub-classes.) The
number of observations in any class may be any
number equal to or greater than two, and the classes
may be of unequal size. We shall speak of the
various classifications A, B, C, ..., M as different
levels of classifications.

The following specification equation can be established
for the ith observation:

$$ (X_i - \bar{X}_t) = (X_i - \bar{X}_{A_i}) + (\bar{X}_{A_i} - \bar{X}_{B_i}) + \cdots + (\bar{X}_{M_i} - \bar{X}_t) \tag{1} $$

where $\bar{X}_t$ is the grand mean for all N observations,
and $\bar{X}_{A_i}, \bar{X}_{B_i}, \ldots, \bar{X}_{M_i}$ are the means of
the A-, B-, ..., M-classes in which the ith observation
is found.

If we square and sum both sides of equation (1)
over all observations, all the cross-product terms
vanish, and hence

$$ \sum^{N} (X_i - \bar{X}_t)^2 = \sum^{N} (X_i - \bar{X}_{A_i})^2 + \sum^{N} (\bar{X}_{A_i} - \bar{X}_{B_i})^2 + \cdots + \sum^{N} (\bar{X}_{M_i} - \bar{X}_t)^2 \tag{2} $$

Therefore each term on the right-hand side of (2),
when divided by an appropriate number of degrees
of freedom, yields an independent estimate of population
variance. These estimates may be compared
by the usual F-ratio procedure in testing
hypotheses regarding the equality of population
variances associated with each level of classification.
The resulting analysis of variance table
[is shown at the top of page 119]."

The analysis of variance by the hierarchies of
the data of this study, following Professor
Carroll's model, is presented for all variables
on the pages following.

From Tables XXII through XXVI, it may

*This new application of analysis of variance was developed in
connection with this paper and is not yet available in other publications.


(Professor Carroll's Analysis of Variance Table)

                                                           Variance
Source of Variance     Sum of Squares            df        Estimate
Between M-classes      Σ (X_Mi - X_t)^2          m - 1     By division
  ...
Between B-classes      Σ (X_Bi - X_Ci)^2         b - c     By division
Between A-classes      Σ (X_Ai - X_Bi)^2         a - b     By division
Within A-classes       Σ (X_i - X_Ai)^2          N - a     By division
Total                  Σ (X_i - X_t)^2           N - 1
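
The additivity claimed in equation (2) can be checked numerically. The following is a minimal sketch on a small set of invented scores (the community, school, and teacher labels and all score values are hypothetical, chosen only to give unequal class sizes); it is not the data of the study. It computes the class means at each level of the hierarchy and confirms that the four component sums of squares add up to the total.

```python
from collections import defaultdict

# (community, school, teacher, score); hypothetical nested data with unequal class sizes.
data = [
    ("C1", "S1", "T1", 10.0), ("C1", "S1", "T1", 12.0),
    ("C1", "S1", "T2", 15.0), ("C1", "S1", "T2", 17.0),
    ("C1", "S2", "T3", 20.0), ("C1", "S2", "T3", 22.0),
    ("C1", "S2", "T4", 25.0), ("C1", "S2", "T4", 29.0),
    ("C2", "S3", "T5", 30.0), ("C2", "S3", "T5", 34.0),
    ("C2", "S3", "T6", 31.0), ("C2", "S3", "T6", 33.0),
    ("C2", "S4", "T7", 40.0), ("C2", "S4", "T7", 44.0),
    ("C2", "S4", "T8", 41.0), ("C2", "S4", "T8", 47.0),
    ("C2", "S4", "T8", 45.0),
]


def class_means(rows, level):
    """Mean score of every class at one level (0=community, 1=school, 2=teacher)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        totals[row[level]] += row[3]
        counts[row[level]] += 1
    return {label: totals[label] / counts[label] for label in totals}


grand = sum(row[3] for row in data) / len(data)
m_comm = class_means(data, 0)
m_school = class_means(data, 1)
m_teach = class_means(data, 2)

# One squared deviation per observation at each level, as in equation (2).
ss_total = sum((x - grand) ** 2 for *_, x in data)
ss_within = sum((x - m_teach[t]) ** 2 for _, _, t, x in data)
ss_teachers = sum((m_teach[t] - m_school[s]) ** 2 for _, s, t, _ in data)
ss_schools = sum((m_school[s] - m_comm[c]) ** 2 for c, s, _, _ in data)
ss_comm = sum((m_comm[c] - grand) ** 2 for c, _, _, _ in data)

components = ss_comm + ss_schools + ss_teachers + ss_within
print(f"total sum of squares: {ss_total:.4f}")
print(f"sum of components:    {components:.4f}")
```

Each class mean here is the mean of the observations in that class, which is what makes the cross-product terms vanish even when the classes are of unequal size.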


TABLE XXII

ANALYSIS OF VARIANCE BY HIERARCHIES, FOR INCLUSIVE SCORES

Source of                 Sum of              Variance
Variation                 Squares      df     Estimate       F        P
Between Communities      13,481.86      1    13,481.86
                                                             .50    >.05
Among Schools            80,500.80      3    26,833.60
                                                            7.22    <.01
Among Teachers          104,122.62     28     3,718.66
                                                           15.86    <.01
Within Teachers         410,926.13   1753       234.41
Total                   609,031.41   1785       341.19


TABLE XXIII

ANALYSIS OF VARIANCE BY HIERARCHIES, FOR PRECLUSIVE SCORES

Source of                 Sum of              Variance
Variance                  Squares      df     Estimate       F        P
Between Communities       1,642.11      1     1,642.11
                                                             .06    >.05
Among Schools            76,665.42      3    25,555.14
                                                            5.49    <.01
Among Teachers          130,405.45     28     4,657.34
                                                           16.31    <.01
Within Teachers         500,569.38   1753       285.55
Total                   709,282.36   1785       397.36
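
The chain of comparisons in a hierarchical table of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure as such: it uses the sums of squares printed in Table XXIII, takes the within-teachers degrees of freedom as 1753 (the value consistent with the printed variance estimate of 285.55), and tests each level against the level immediately below it.

```python
# (source, sum of squares, degrees of freedom) as read from Table XXIII.
rows = [
    ("Between communities", 1642.11, 1),
    ("Among schools", 76665.42, 3),
    ("Among teachers", 130405.45, 28),
    ("Within teachers", 500569.38, 1753),  # df assumed to be 1753
]

# Variance estimate at each level: sum of squares divided by degrees of freedom.
variances = [ss / df for _, ss, df in rows]

# Each level is compared with the level immediately beneath it in the hierarchy.
f_ratios = [variances[i] / variances[i + 1] for i in range(len(rows) - 1)]

for (source, _, _), variance in zip(rows, variances):
    print(f"{source:<22} variance estimate {variance:>10.2f}")
for i, f_value in enumerate(f_ratios):
    print(f"{rows[i][0]} vs. {rows[i + 1][0].lower()}: F = {f_value:.2f}")

total_ss = sum(ss for _, ss, _ in rows)
print(f"total sum of squares: {total_ss:.2f}")
```

The point of the design is that a higher level is judged against the variation among the classes nested within it, rather than against the pupil-level variation used in the simple analyses.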


TABLE XXIV

ANALYSIS OF VARIANCE BY HIERARCHIES, FOR CONJUNCTIVE SCORES

Source of                 Sum of              Variance
Variance                  Squares      df     Estimate       F        P
Between Communities       1,379.46      1     1,379.46
                                                             .19    >.05
Among Schools            21,774.30      3     7,258.10
                                                            2.22    >.05
Among Teachers           91,692.72     28     3,274.74
                                                           14.45    <.01
Within Teachers         397,231.90   1753       226.60
Total                   512,078.38   1785       286.88


be seen that the F-ratios attain significance for
each variable in the comparison of the among-teacher
means with the within-teacher (actually
among-pupil) means.

The hierarchical comparison between teachers
and schools yields F significant at the .01
level only for inclusive and preclusive scores.
None of the variables attain significant F's in
the comparison of school variances with community
variances.

With the exception of the preclusive and inclusive
variables compared by teachers and schools,
only the among-teachers with within-teachers
(among-pupils) comparison remains significant
after the correction provided by the analysis of
variance for hierarchies. It appears that the consistent
and genuine differences observed are those
among teachers, and the analysis gives evidence
that the differences noted among the schools and
communities in the simple analysis, with the
exceptions noted, are no more than would be
expected in view of the differences among teachers.
This gives some ground for saying that schools
and communities do differ, but that the differences
are attributable to teachers rather than to the
influence of intrinsic school or community
differences. It seems fair to conclude that the
scales on the "Survey" serve to discriminate
among teachers on the variables of this study,
and that the observed significant differences
among schools and between communities
need to be qualified in the light of the
evidence provided by the hierarchic analysis of
variance.

If these findings are kept in mind, then it may
perhaps be said that the socioeconomic level of
the community can be viewed as affecting the
amount of work reported by the pupils of a
community to the degree that the differences
among teachers employed [...] reflect that
socioeconomic level.
Standard Error of Measurement and
Reliability of the Scales

In the preceding computations the variance
within groups was derived for each scale.
The square root of this error variance
provides a measure of standard error for each
of the five scales. The reliability coefficient
of a scale is also computed for the average score
assigned to the teacher of a group. The formula for
the reliability coefficient of this average score
by a group is

$$ r = \frac{\sigma_a^2 - \sigma_w^2}{\sigma_a^2} $$

where subscripts w and a denote the variance
estimate within-teachers and the among-teachers
estimate respectively. Table XXVII presents the
results of the computation.
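
The computation can be illustrated with the variance estimates printed in Tables III through V. This sketch assumes the reliability coefficient takes the form (among-teachers variance minus within-teachers variance) divided by the among-teachers variance, and that the standard error is the square root of the within-teachers variance; under that assumption the coefficients come out at the values printed in Table XXVII. The function names are hypothetical.

```python
from math import sqrt


def group_reliability(var_among, var_within):
    """Reliability of a group's average score, assuming r = (Va - Vw) / Va."""
    return (var_among - var_within) / var_among


def standard_error(var_within):
    """Standard error of measurement taken as the square root of the within variance."""
    return sqrt(var_within)


# (among-groups variance, within-groups variance) from Tables III, IV and V.
scales = {
    "Conjunctive": (3588.95, 226.60),
    "Required work": (4417.62, 248.65),
    "Self-initiated work": (2141.75, 227.88),
}

results = {
    name: (standard_error(vw), group_reliability(va, vw))
    for name, (va, vw) in scales.items()
}

for name, (se, r) in results.items():
    print(f"{name:<20} standard error {se:6.2f}   reliability {r:.3f}")
```

Because the coefficient refers to the average of all pupils reporting on a teacher, it describes the stability of the group assessment, not of a single pupil's report.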

The reliability coefficients for group assessments
are quite substantial. There is some reason
to conclude that the five scales furnish reliable
measurements of the teacher traits and of
pupil productivity when these measurements are
computed from all the scores of the pupils reporting
on a given teacher. In view of these findings
it would appear that the scales of the "Pupil Survey"
may be used with some confidence for the
characterization of teachers in terms of the traits
examined in the course of this research.


The Correlational Analysis

The variables have been subjected to two kinds
of correlational analysis: (1) the "perception"
analysis, and (2) the trait analysis. The results
of the perception analysis are reported in detail
in another article. It may be relevant, however,
[...] indications of the major elements of
[...]

[...] computed for all of the variables
with each other within each group. Thus,
for each [...] variables there were 33
coefficients. Each of these arrays of coefficients
was tested for significance [...]
the sign [...] or the binomial [...]
the presence [...] the relationships between the
members [...] paired variables, without reference
to the magnitude of the trend. The t-test was then
applied in order to obtain an estimate of the
significance [...] relationships between the paired
[...] group. In one sense these
[...] viewed as 33 replications of
[...], with the significance of the coefficients
[...] established for each administration [...].
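
A sign test of the kind referred to above can be sketched in a few lines. This is a generic two-sided binomial sign test, offered as an illustration only; the source does not give the details of the authors' computation, and the count of 27 positive coefficients out of 33 used in the example is hypothetical.

```python
from math import comb


def sign_test_p(n_positive, n_total):
    """Two-sided sign test of the hypothesis that positive and negative signs are equally likely."""
    k = max(n_positive, n_total - n_positive)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * tail)


# Hypothetical example: 27 of the 33 within-group coefficients are positive.
p_value = sign_test_p(27, 33)
print(f"two-sided p-value: {p_value:.6f}")

# An almost even split gives no evidence of a trend.
p_even = sign_test_p(17, 33)
print(f"two-sided p-value for 17 of 33: {p_even:.6f}")
```

The test uses only the direction of each coefficient, which matches the remark that it is made without reference to the magnitude of the trend.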
The results were inconclusive for the preclusive
[...] with both the required work and the
[...]

[...] that the inclusive and the
[...] were each related to each
[...] the relationship of the inclusive
[...] exhibiting the greatest stability
[...] coefficients attained significance
[...] level).

[...] as to the correlation of the inclusive
[...] and self-initiated work
scores suggested a similar examination for the
[...] single sample and for the pupils
[...] separately. In order to avoid
the spuriousness that might result from throwing
[...] unlike means, the r's
[...] from the within-sums, which had
[...] in the course of an analysis of variance
with covariance adjustment. The total r's,
the between-group r's and the within-group r's


TABLE XXV

ANALYSIS OF VARIANCE BY HIERARCHIES, FOR REQUIRED WORK SCORES

Source of                 Sum of              Variance
Variance                  Squares      df     Estimate       F        P
Between Communities      10,879.76      1    10,879.76
                                                            1.56    >.05
Among Schools            20,997.37      3     6,999.12
                                                            1.79    >.05
Among Teachers          109,486.79     28     3,910.24
                                                           15.73    <.01
Within Teachers         435,887.49   1753       248.65
Total                   577,251.41   1785       323.39

TABLE XXVI

ANALYSIS OF VARIANCE BY HIERARCHIES, FOR SELF-INITIATED WORK SCORES

Source of                 Sums of             Variance
Variation                 Squares      df     Estimate       F        P
Between Communities         659.88      1       659.88
                                                            0.12
Among Schools            17,143.55      3     5,714.52
                                                            3.15
Among Teachers           50,732.43     28     1,811.87
                                                            7.95
Within Teachers         399,407.86   1753       227.84
Total                   467,943.72   1785       262.15


TABLE XXVII

STANDARD ERROR OF MEASUREMENT AND RELIABILITY COEFFICIENTS OF THE SCALES

                                                               Required     Self-Initiated
                       Inclusive    Preclusive   Conjunctive   Work         Work
Range of Scores        23 to 119    [illegible]  [illegible]   [illegible]  [illegible]
Variance Estimate      234.41       285.55       226.60        [illegible]  [illegible]
Standard Error         15.31        16.90        15.05         15.77        15.09
Reliability Coefficient
for Group Assessment   .962         [illegible]  .937          .944         .894

TABLE XXVIII

CORRELATIONS OF INCLUSIVENESS WITH REQUIRED WORK SCORES
AND WITH SELF-INITIATED WORK SCORES; FOR THE TOTAL SAMPLE AND
FOR EACH COMMUNITY

[Table body not recoverable. Rows: total r, among-groups r, and
within-groups r for the total sample, for Community A, and for
Community B; the among-groups r's for the total sample have 31
degrees of freedom.]

[...] presented in Table XXVIII.

[...] of Table XXVIII reveals that
all the r's except the among-groups r of community
B are significant at the .01 level. The
single exception is significant at the .025 level.

With 31 degrees of freedom for the among-groups
r's of Table XXVIII, all these coefficients
are significant at the .01 level. It may be said
then that the average scores of the groups for
teachers' inclusiveness are positively and significantly
related to the average scores of the groups on
required work and on self-initiated work. There
[...] from scores on the
[...] operational terms of
[...] productivity [...]
the conjunctive [...]

Summary and Conclusions

The analysis of variance reveals that the five
scales are capable of differentiating sharply
among groups (teachers), and that although those
differences extend up through the two communities,
each taken separately as a single sample,
and through the total sample population taken as a
single group, the genuine differences are those
among the teachers, the other steps of the hierarchy
merely reflecting the differences. The significant
differences noted among the teachers
continue to appear when the analysis of variance is
performed among the teachers of each subject
separately.

The reliability of each scale is quite substantial,
the coefficients ranging from .89 to .96 for
the five variables.

The average scores on the teachers' inclusiveness
are significantly related to the average productivity
scores for both of the criterion measures
of pupil work. It seems fair to conclude that,
as measured in this research, the teachers'
inclusiveness is an observable and measurable trait of
teachers and that it is related to the amounts of the
pupils' required and self-initiated work scores.


THE USE OF FREE RESPONSE DATA IN
WRITING CHOICE-TYPE ITEMS


DESMOND L. COOK
Purdue University
PURPOSE


MANY TEXTBOOKS on educational measurement
give rules or suggestions for constructing
test items of various types. One
suggestion often made is that the distracters
(wrong responses) for choice-type items should
be plausible. The application of this rule requires
that an item writer have some notion concerning
what alternatives are likely to serve as
efficient distracters. For an experienced item
writer this selection of plausible distracters
may be easy. A relatively inexperienced
writer, however, may experience
difficulty in devising plausible wrong choices.

Some writers on educational test construction
have suggested using students' responses to
items set up in free response form as a source
of possible distracters. In general, these writers
[...] that using student responses to free
response items as a source of distracters for a
choice-type item often results in a better item
than when such data is not used.

Although the use of free response data has
been recommended by many authors of textbooks
in educational measurement, research studies
designed to evaluate the procedure have been
relatively few. Kelley (1) in 1937 used free response
data in constructing a vocabulary test and
concluded that the procedure was questionable in
view of the work involved and the [...].
[...]sen and Satter (2) in 1953 reported
the use of free response data in constructing
arithmetic problems and found the method was of
some help. They did not, however, provide a direct
comparison between items constructed with
and without knowledge of such data.

The results of these two research studies
[...] the value of free response data is questionable.
In addition, the procedure has been
applied only to such factual school subjects
as vocabulary and arithmetic, where
[...] incorrect answers [...]
be determined. No experimental
evidence has been obtained in "content" areas
such as the social studies. Furthermore,
[...] doing the item writing. As the method of using
free response data is offered as a general suggestion
for all item writers, it is possible that the results
of previous studies have been largely determined
by the ability of a particular writer to use
the free response data, rather than by the general
value of the method itself. In view of the above
considerations and the rather limited amount of
[...] the general problem, further study of
this problem seems warranted, particularly with
regard to item writer differences in areas where
free response answers are not so easily categorized.

The purpose of this study was to determine
whether or not choice-type items written with the
aid of information on student free responses could
be more discriminating than items written without
the use of such information. If items written with
the aid of free response data are more highly discriminating
than those written without such help,
the use of free response data would be considered
valuable.


PROCEDURES


Selection of Items and Writers

[...] of students' wrong answers,
[...] tests, each composed of thirty free
[...] in the area of Contemporary Affairs,
[...] administered as part of the
freshmen testing program to 720 men entering
[...] of Iowa in September 1953.
[...] and these [...]
First, the [...]
of question used to measure
[...] it, were broadly representative
[...] in a variety of subjects
of study at the college level. Results of an investigation
in this area could be generalized more
[...] than if a more restricted or specialized
area had been chosen.

Second, since a sample of trained item writers
[...] required to test the usefulness of free response
[...] it was necessary to choose an area
in which training in item writing rather than
specialized subject matter competence could
be the determining factor in choice of item
writers.

Finally, it was desired to choose an area in
which the tests could be administered under natural
conditions as part of a purposeful testing
program involving a large number of well motivated
students. It was also decided to conduct
the testing under the close supervision of the author.
These conditions seemed best met by administering
a test in Contemporary Affairs to
students entering the College of Liberal Arts at
the State University of Iowa.

The three tests were scored and item analysis
data obtained separately for each test. The
sixty most discriminating items among the ninety
were selected for final use in studying the usefulness
of free response data.

For each of the sixty selected items, the responses
which had been marked incorrect in the
original scoring were listed until ten different
wrong answers had been listed, or until all the
responses to a particular item had been examined.
All of the listing of incorrect answers was
done by the writer.

Wrong answers which expressed the same
general idea were not listed separately. [...]
categorize. The list of wrong answers to Item E,
for example, is presented below. The number
after each incorrect [...]

Answer: North Korea and China

Wrong answer                                              Frequency
Communist China [...]                                     [illegible]
Russia                                                    [illegible]
Korea, China, and Russia                                  [illegible]
North Korea, Indochina                                    [illegible]
China and Communist North Korean soldiers and also [...]  [illegible]
Manchuria                                                 [illegible]
North Korea                                               [illegible]
Russia and North Korea                                    [illegible]
China and Korea                                           [illegible]
China and Manchuria                                       1
Soldiers from North Korea opposed the UN. Russia
helped with supplies and planes                           [illegible]


The use of free response data is offered as a
general suggestion to all item writers. It may be,
however, that some writers are helped more than
others by such data. To demonstrate that the
method is of value as a general procedure, it was
decided to use a group of item writers. Six item
writers were used in the present study, each writing
twenty items, ten with and ten without the help
of free response data. This arrangement represented
a compromise between the desirability of
having a large group of item writers and having
each writer work with a large number of items.

The six item writers used in this study were
selected from the better students in a graduate
level introductory class in test construction. All
of the writers had received some instruction in
item writing. They may be regarded as representative
of the large group of potential item writers
to which recommendations concerning use of
free response data would normally be addressed.


Construction of the Tests


After the free response data had been tabulated
and the writers selected, the next step was to
secure items written under the two methods of the
problem, that is, with and without the use of free
response data.

Each writer first constructed a set of ten four-response
multiple-choice type items on ten assigned
questions without knowledge of the free response
data. They then constructed ten additional
items on ten different assigned questions with
the knowledge of the free response data. Under
the conditions of the experiment, it would have
been possible for each writer to use the same
questions for writing both sets of items. This
would balance out experimental errors associated
with possible differences in the writer's background
for handling different sets of questions.
But it might also introduce a bias for or against
the items written in the second phase (with
knowledge of free response data). Since an item
writer would ordinarily deal with the same question
only once, and since the second source of error
could not be estimated in this experiment, it
was decided to assign a different set of items to
each writer for the second phase of the item writing.

The first set of questions was assigned to each
item writer by arranging their names in random
order, and designating the first as writer A, the
second as writer B, and so on. The items assigned
to A were the first ten items of the
sixty items selected from the free response tests.
Writer B received the next ten, and so on.

[...] It contained a sheet of directions
[...] general nature of the study and procedures
to be followed, a list of items to be written,
and several sheets of paper on which to
write [...]. The list of items consisted of
ten [...] items in question form and the corresponding
correct answers. The directions
told the writers to construct four-response
choice-type items using the item stem and correct
[...] as presented and providing three distracters
which would be effective. The item writers
[...] to modify the item stem or
the correct response if they felt it would improve
the item, but they were asked to hold such
changes to a minimum.

[...] of this first set of items,
[...] writer was given another set of items to
write. The ten items on this second set were
different from those on the first set. The assignment
of the second set of items was made by
randomly [...] ten items from those remaining
[...] the items previously written had been
excluded. The second set of materials consisted
of a [...] sheet, a set of ten item [...],
[...] on which to write the items. Each
[...] an item stem, the correct answer,
[...] of tabulated wrong responses. The
directions were essentially the same as for the
first [...] the writers were told that
[...] make any use of the free response
data they wished. The writers were also asked
to write a short statement expressing their opinion
[...] value of using free response [...].

[...] of the above procedure, 120 items were
written by the six writers. Sixty of the items
were prepared without knowledge of free response
data, and sixty items with
such knowledge. For convenience,
items written with knowledge of free response
data will hereafter be referred to as K-items,
and items written with no knowledge of free
response data will be called N-items.


Administration and Scoring of the Items


The sixty K-items and the sixty N-items written under the procedure described above were used to construct two forms (A and B) of a choice-type test. The two forms of the choice-type test were constructed by alternating N-items and K-items. The result of this step was that on Form A the first item was an N-item, the second a K-item, the third an N-item, and so on. On Form B the first item was a K-item, the second an N-item, the third a K-item, and so on. Each form thus had thirty K-items and thirty N-items. This alternation of N-items and K-items was done so that responses to both types of items could be secured for any student taking the test. The forms themselves were distributed alternately to the students taking the choice-type test.
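The alternation of N-items and K-items across the two forms can be sketched as follows. This is a minimal illustration in Python, not the author's procedure: the item labels (`N1`, `K1`, and so on) are hypothetical, and the rule that the first thirty items of each pool go to Form A is an assumption, since the text does not say how items were allocated to forms.

```python
def interleave(first_pool, second_pool):
    """Alternate items from two equal-sized pools, starting with first_pool."""
    form = []
    for first_item, second_item in zip(first_pool, second_pool):
        form.append(first_item)
        form.append(second_item)
    return form


# Hypothetical labels for the sixty N-items and the sixty K-items.
n_items = [f"N{i}" for i in range(1, 61)]
k_items = [f"K{i}" for i in range(1, 61)]

# Assumed allocation: first half of each pool to Form A, second half to Form B.
form_a = interleave(n_items[:30], k_items[:30])  # N, K, N, K, ...
form_b = interleave(k_items[30:], n_items[30:])  # K, N, K, N, ...
```

Under these assumptions each form holds thirty items of each kind, and no item appears on both forms.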

The administration of items written under the two conditions was accomplished when the tests were given to 303 new men and women students in the College of Liberal Arts, State University of Iowa, in February and June 1954. One hundred fifty-one students took Form A and one hundred fifty-two students took Form B of the choice-type test.
Three scores on the choice-type items were obtained for each student. The first score was the total number of items correct, the second was the total of N-items correct, and the third was the total of K-items correct. All three scores were based on number right. No correction for guessing was applied to any score.
RESULTS


Because of the method employed in constructing and distributing Forms A and B, it was anticipated that the two forms would yield similar distributions [...] approximately alike. The total score on the test for any student consisted [...] of his scores on the N-items and the K-items. Figure 1 presents the distribution of scores on the two types of items. To make this comparison, distributions were made of the N-items and K-items scores for each student regardless of which form he took. This graphical comparison shows that the scores on the N-items tended to be higher than those on the K-items.
Indices of discrimination and difficulty were obtained for the N-items and for the K-items. The data were secured by the Upper-Lower Method (3). This method consists of selecting the highest and lowest 27 percent of scores in a group as the criterion groups. The discrimination index is the ratio of the difference between the numbers answering the item correctly in the upper and lower groups to the number in each group. The item difficulty index is the ratio between the number in both groups marking the item correctly to the total number in both criterion groups. This ratio is converted to a percent by multiplying by 100.
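The two indices of the Upper-Lower Method can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the author's computing routine: the function name and arguments are hypothetical, the size of each criterion group is taken as 27 percent of the sample rounded to the nearest whole student, and ties at the cut-off are broken by sort order.

```python
def upper_lower_indices(total_scores, item_correct, fraction=0.27):
    """Return (discrimination, difficulty_percent) for a single item.

    total_scores: total test score for each student.
    item_correct: 1 if that student answered the item correctly, else 0.
    """
    n_students = len(total_scores)
    group_size = max(1, round(n_students * fraction))  # assumed rounding rule

    # Rank students by total score; lowest first, highest last.
    order = sorted(range(n_students), key=lambda i: total_scores[i])
    lower_group = order[:group_size]
    upper_group = order[-group_size:]

    upper_right = sum(item_correct[i] for i in upper_group)
    lower_right = sum(item_correct[i] for i in lower_group)

    # Difference in number correct, relative to the size of one group.
    discrimination = (upper_right - lower_right) / group_size
    # Percent correct across both criterion groups combined.
    difficulty_percent = 100 * (upper_right + lower_right) / (2 * group_size)
    return discrimination, difficulty_percent
```

Note that, as defined in the text, the difficulty index is a percent correct, so a higher value indicates an easier item.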
The distribution of the discrimination and difficulty indices for the N-items and the K-items is shown in Figure 2. The length of the rectangle represents the inter-quartile range. The horizontal line running across the rectangle represents the median. Inspection of this figure shows that the median discrimination index for the N-items was slightly higher than for the K-items. The inter-quartile range was a little greater for the K-items than for the N-items, as was the total range. It may also be observed from Figure 2 that the median difficulty indices for the two types of items were about the same. The inter-quartile range was a little greater for the K-items than for the N-items. Summarizing the data presented in Figure 2, it may be stated that the differences between the two types of items in discrimination and difficulty appeared to be small.

FIGURE 1

[...] (K-ITEMS) FOR ALL [...] SAMPLE TAKING CHOICE-TYPE TESTS

Raw Score     N Items (N = 303)     K Items (N = 303)
High                27.0                  25.0
Q3                  21.9                  20.3
Median              18.1                  16.9
Q1                  14.1                  [...]
Low                  6.0                  [...]
A more precise comparison of the effect on the discrimination and difficulty indices of the two methods of writing was made through analysis of variance techniques. For this purpose, a "treatments by subjects" analysis described by Lindquist (4) was employed. Individual writer means for both discrimination and difficulty are presented in Table I along with the general mean for each method of item writing and for all items.
The summary table for the analysis of variance of the discrimination indices is presented in Table II. The test of significance for the main effects (the two methods of item writing) was made by obtaining the ratio between the mean square for methods and the mean square for interaction (methods by writers). The hypothesis tested is the null hypothesis, i.e., that the mean discrimination index for a large population of items written with knowledge of free response data is the same as that for a large population of items written without the knowledge. If the null hypothesis cannot be rejected at the 5 percent level of confidence, then any difference between the mean values may be attributed to sampling fluctuations. The obtained F-ratio [...] obviously is not significant. The null hypothesis is tenable.
Using the mean square for interaction as the error term in this analysis permits one to generalize to a population of item writers of which those used in the study are a random sample. In view of the results of the test, it may be said that for items and item writers like those in the study, differences in mean discrimination, if any, are not large enough to be established by a study of this scope and precision.
Normally, no test need be made of the methods by writers interaction even though an error term is available for the purpose. As Lindquist points out, an intrinsic "treatments by subjects" interaction is taken for granted in this experimental design. That is, one can be almost certain that the methods have a different effect from writer to writer, or that one writer can use the free response data better than another. However, it is easy to test this interaction in the present experiment, and it is also instructive. The test for significance of interaction is the ratio of the mean square for methods by writers over the mean square for within cells. The obtained F-ratio was 2.569 with 5 and 108 degrees of freedom. This is significant at the 5 percent level. On the basis of this result, it can be said that item writers like those in this study differ in their ability to use free response data effectively.

The analysis of variance procedures used with the discrimination indices were also applied to the difficulty indices for the same items. The writer means, methods means, and the general mean of the difficulty indices are also given in Table I. The summary table for the analysis of variance, using the mean values as criterion measures, is given in Table III.
The test of the null hypothesis with respect to the effect of free response data on difficulty indices was made as before, using the ratio of the mean square for methods over the mean square for the methods by writers interaction. The obtained F-ratio was 2.16 with 1 and 5 degrees of freedom. At the 5 percent level, the hypothesis is tenable that the mean difficulty indices for the items written by the two methods are the same.

Since there were also ten difficulty indices available for each writer under each method, it was possible to compute a within cells mean square, and to use this as an error term to test the methods by writers interaction. The obtained F-ratio was .406, which obviously is not significant. If the effects of use of free response data on item difficulty are different for various item writers, these differences are not large enough to be demonstrated in the present study.
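The two F-ratios reported for the difficulty indices can be reproduced from the mean squares given in Table III. The sketch below is only a worked check in Python; the variable names are illustrative, and the critical values are the tabled F.05 values printed with Table III rather than values computed here.

```python
def mean_square(sum_of_squares, degrees_of_freedom):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / degrees_of_freedom


# Values as printed in Table III (difficulty indices).
ms_methods = mean_square(30.40, 1)
ms_writers = mean_square(257.00, 5)
ms_interaction = 14.09      # methods by writers, as printed
ms_within_cells = 34.70     # as printed in the interaction F line

# Methods tested against the methods-by-writers interaction.
f_methods = ms_methods / ms_interaction
# Interaction tested against the within-cells mean square.
f_interaction = ms_interaction / ms_within_cells

# Tabled 5 percent critical values quoted with Table III.
F05_METHODS = 6.61       # F.05(1, 5)
F05_INTERACTION = 2.30   # F.05(5, 100)

methods_significant = f_methods > F05_METHODS
interaction_significant = f_interaction > F05_INTERACTION
```

Both ratios fall below their critical values, which agrees with the conclusions stated in the text for the difficulty indices.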


Comments by the Item Writers

The directions for the second set of items to be constructed asked each writer to make a brief statement expressing his opinion of the value of free response data in writing choice-type items. Although all of the writers stated that the free response data had been of some help, there was not complete agreement as to its value. Some writers, including D and E, felt that its value was rather limited and that items written without the data would have been essentially the same as those written with knowledge of the free response data.
SUMMARY


The purpose of this study was to investigate the value of using free response data as a source of distracters in constructing choice-type items. Specifically, comparisons were made between the discrimination and difficulty indices for the items written by a group of six item writers both
FIGURE 2


DISTRIBUTION OF DISCRIMINATION AND DIFFICULTY INDICES FOR
ITEMS WRITTEN WITHOUT AID OF FREE RESPONSE DATA
(N-ITEMS) AND FOR ITEMS WRITTEN WITH AID OF FREE RESPONSE
DATA (K-ITEMS)


             Discrimination          Difficulty
           N-Items   K-Items     N-Items   K-Items
High         .63       .76          91        93
Q3          [...]     [...]         70        70
Median       .33       .28          57        55
Q1           .20       .17          45        39
Low         -.02      -.02        [...]     [...]

TABLE I


MEAN DISCRIMINATION AND DIFFICULTY INDICES FOR ITEMS WRITTEN
WITH AND WITHOUT AID OF FREE RESPONSE DATA


A 


gy ы оо m 


.45 .26 


.35 .26 1 е н Š 
.32 .45 .39 85 ш © 
.30 .30 .30 61 56 5% 
.38 E as $ aN а 
p .29 .24 47 59 5 


TABLE II


SUMMARY TABLE FOR ANALYSIS OF VARIANCE FOR
DISCRIMINATION INDICES


Sources
Methods (m)
Writers (w)
Methods-Writers Interaction (mw)


Methods:      F = MSm / MSmw = [...]
Interaction:  F = MSmw / MSw cells = [...] / 18.41 = 2.569;   F.05(5, 100) = 2.30
TABLE III


SUMMARY TABLE FOR ANALYSIS OF VARIANCE FOR
DIFFICULTY INDICES


                                    Degrees of    Sums of     Mean
Sources                             Freedom       Squares     Squares
Methods (m)                            1            30.40      30.40
Writers (w)                            5           257.00      51.40
Methods-Writers Interaction (mw)       5            70.48      14.09


Methods:      F = MSm / MSmw = 30.40 / 14.09 = 2.16;          F.05(1, 5) = 6.61
Interaction:  F = MSmw / MSw cells = 14.09 / 34.70 = .406;    F.05(5, 100) = 2.30
with and without knowledge of the student's responses to these same items in free response form.
The following results were obtained:

1. The frequency distributions of discrimination indices for the items written with and without the free response data were similar. Items written with free response data appeared to be slightly less discriminating than those written without the response data, but the difference was not significant.

2. The frequency distributions of difficulty indices for the items constructed with and without the use of free response data were similar. Items written with free response data appeared to be slightly more difficult than those written without the data, but the difference was not significant.

3. Different item writers appeared to differ in the effectiveness with which they utilized free response data to write discriminating test items.

4. Statements made by writers show that some felt that the free response data was helpful while others felt they could do just as well without such data as with it.


. Kelley, Victor

. Johnson, A. Pemberton.
ANALYSIS OF CORNELL ORIENTATION INVENTORY
ITEMS ON STUDY HABITS AND THEIR
RELATIVE VALUE IN PREDICTION
OF COLLEGE ACHIEVEMENT

Cornell University


SINCE THE end of World War II, several studies have been conducted at Cornell University to provide the means for more efficient prediction of college achievement. Originally, three tests, the Ohio State Psychological, the Cooperative Natural Science and the Cooperative Mathematics, were administered to all entering [...]. The first group studied was composed of [...] students enrolled as freshmen from 1942 to 1946 at the New York State College of Agriculture at Cornell University. When each predictive variable was studied with the influence of all the other variables held constant, only the secondary school average and the Ohio State Psychological Test score proved to be of value in predicting academic achievement. The fourth order partial correlation coefficient of the college average to the secondary school average was .353 and to the Ohio State Psychological Test score .208 for all groups combined. The Cooperative Mathematics and the Cooperative Natural Science were so highly correlated with the Ohio State Psychological and with the secondary school average that their independent influence was negligible in predicting the college average. The coefficient of multiple correlation between the first-term average and the team of secondary school average and the three tests [...] was .57 for all students and .64 for those enrolled in the general curriculum of the College of Agriculture.
In a second study [...] the secondary school average was the best single [...] of achievement in college. [...] the Cooperative Mathematics Test. Its correlation with the first-term average was found to be .307 compared to .251 of the Cooperative Mathematics Test used in the first study. [...] with the exception of the secondary school average, had the highest correlation coefficient [...] with the first-term average [...] and the secondary school average [...] New York State Regents Examination scores, had a multiple correlation [...] with the first-term average.

In a follow-up of these studies the correlation between the first-term average and the [...] Mathematics Test scores [...] .368 and between the Cooperative [...] Science Test [...] average [...] .469. The Ohio State Psychological [...] .407 with the first-term average. [...] to the battery of tests [...] of Agriculture at Cornell University. The test as a whole correlated .454 with the first-term average [...] comprehension part correlated .461. A multiple correlation coefficient of .617 was found between [...] and the team of secondary school average, the average derived from the Regents Examination and the Speed of Comprehension score of the Cooperative Reading Test.

[...] comparison of the findings of [...] with those of the present [...]

Cornell Orientation Inventory 
In the first study mentioned above,3 poor health, lack of self-discipline, inability to organize [...] and material, too much outside work and extracurricular activities were mentioned as possible causes of failure in college. Therefore, for a more efficient prediction of college success, the development and use of an inventory covering these factors was recommended. Such an inventory was developed and given to all the 1948 freshmen of the College of Agriculture. The choices for each of the 17 questions on the original Cornell Orientation Inventory were graded from 1 to 5, one indicating that the factor would not hinder the student's achievement in college and five indicating that the factor would hinder his achievement considerably. The items on the Orientation Inventory were scored by totaling the choices checked by the students. The choices thus totaled provided scores which correlated -.22 with the first-term average.4
The present study began with the 1951 freshmen and its data include complete records of freshmen entering the New York State College of Agriculture at Cornell University in September 1951, 1952 and 1953. The 1949 freshmen were given a Revised Cornell Orientation Inquiry5 consisting of 26 items, each having a five-choice answer, instead of the 17-item Inventory given to the 1948 students. In 1950 six more items were added to the 26-item Inquiry and the resulting 32-item Cornell Orientation Inventory was used without any further changes on the 1951, 1952 and 1953 freshmen. Approximately 6 percent of the total number of students in these three classes were eliminated because their scores on the Orientation Inventory or one or more of the four tests used for predictive purposes or their secondary school or college averages were not available. Our sample in this study, therefore, is a total of 813 students in the College of Agriculture at Cornell University comprising 94 percent of the entire freshmen classes in the academic years 1951-52, 1952-53 and 1953-54.


Purpose 


The purpose of this study was to determine the relative validity of the items on the Orientation Inventory compared to the validity of several aptitude and achievement tests and secondary school averages and to develop a multiple-regression equation for predicting first-term college averages.


Procedure


Secondary school averages and scores on four tests were used for the development of a preliminary regression equation. [...] Psychological Test.6 All the standardized tests were administered according to instructions in their manuals.

The first-term college averages of the students were used as the criterion variable. No estimates of reliability were available for [...] in any of the courses. For the 813 students validity data are presented in Table III. [...] means and standard deviations [...] intercorrelations of the predictors are [...]. From these data, with the exception of the [...] for the Cornell Orientation Inventory, the multiple-regression equation


Xc = .388X1 + .077X2 + .157X3 + .119X4 + .007X5 + 22.783


was developed for predicting first-term averages (1,2). The multiple coefficient of correlation for this equation was found to be .536. The coefficient of multiple determination was .287, which shows that we have accounted for 28.7 percent of the variance of freshman averages.
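Applying the first regression equation to a student's record is a weighted sum plus a constant. The sketch below shows this in Python with the printed weights; the function name is illustrative, the five predictor values in the example are invented, and which predictor corresponds to X1 through X5 is not stated in the surviving text, so the ordering is left abstract.

```python
# Weights and constant as printed in the first regression equation.
WEIGHTS = (0.388, 0.077, 0.157, 0.119, 0.007)
CONSTANT = 22.783


def predict_first_term_average(predictors):
    """Predicted first-term average from five predictor scores (X1..X5)."""
    if len(predictors) != len(WEIGHTS):
        raise ValueError("expected five predictor values, X1 through X5")
    return CONSTANT + sum(w * x for w, x in zip(WEIGHTS, predictors))


# Invented scores for one hypothetical student, in the order X1..X5.
example_student = (80, 50, 50, 50, 50)
example_prediction = predict_first_term_average(example_student)
```

For the invented record above the prediction works out to 22.783 + 31.04 + 3.85 + 7.85 + 5.95 + 0.35 = 71.823.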
On the basis of this regression equation first-term averages were predicted for all the students. If the actual first-term average of any student was greater or less than the predicted first-term average of the same student by five or more points, the Orientation Inventory for that particular student was sorted into one of two separate groups. One group was composed of those students making more than five points above the score predicted for them on the basis of the equation. These were called the "over-achievers". The second group was composed of those individuals making more than five points lower than the predicted score for them. These students were called the "under-achievers". The responses made on the Orientation Inventory by these two groups were then analyzed for each item by counting the number of responses that were made by members of each group to each of the five choices possible in every item. In computing chi-squares, adjacent frequencies have been combined in order to avoid expected frequencies under five so as not to violate the requirement of not-too-small expected frequencies.
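The sorting of students and the chi-square comparison described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the author's computation: the function names are hypothetical, the response counts are invented, the rule for choosing which neighbor to merge with is one of several reasonable conventions, and every pooled category is assumed to keep at least one response.

```python
def sort_student(actual, predicted, margin=5.0):
    """Label a student by the gap between actual and predicted average."""
    difference = actual - predicted
    if difference > margin:
        return "over-achiever"
    if difference < -margin:
        return "under-achiever"
    return None  # neither group; inventory not used in the item analysis


def expected_counts(table):
    """Expected frequencies for a rows-by-columns contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def pool_adjacent(table, min_expected=5.0):
    """Merge adjacent choices until no expected frequency is under the minimum."""
    table = [list(row) for row in table]
    while len(table[0]) > 2:
        expected = expected_counts(table)
        small = [j for j in range(len(table[0]))
                 if any(row[j] < min_expected for row in expected)]
        if not small:
            break
        j = small[0]
        # Assumed convention: merge with the right neighbor, else the left.
        neighbor = j + 1 if j + 1 < len(table[0]) else j - 1
        low, high = min(j, neighbor), max(j, neighbor)
        for row in table:
            row[low] += row[high]
            del row[high]
    return table


def chi_square(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    expected = expected_counts(table)
    statistic = sum((o - e) ** 2 / e
                    for row, exp_row in zip(table, expected)
                    for o, e in zip(row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Invented counts of choices 1..5 on one item for the two groups.
over_achievers = [30, 25, 20, 3, 2]
under_achievers = [10, 20, 30, 4, 1]
pooled = pool_adjacent([over_achievers, under_achievers])
statistic, df = chi_square(pooled)
```

With the invented counts, choices 3, 4 and 5 end up combined, leaving a two-by-three table with two degrees of freedom.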


Results 


Eight of the 32 items on the Cornell Orientation Inventory produced very low probabilities at the chi-square tests of significance. These were items 4, 9, 10, 13, 15, 22, 24, and 25. With the exception of item 10, with a probability of .01, and item 15, with a probability of .02, all had the very low probability of < .01. This is the probability that with a given number of degrees of freedom the chi-square value obtained in the comparison of the distribution of the sample with a hypothetical series dictates that the sample belongs to or has arisen out of such a series.

The eight items which proved to have predictive value are listed below:


A) Item 4. I neglect certain courses because of lack of interest.

1. Never 2. Rarely 3. Occasionally 4. Almost always 5. Always


B) Item 9. When in high school it was necessary for me to study at home

1. Almost every night 2. Several nights per week 3. One or two nights per week 4. Infrequently 5. Never


C) Item 10. It is possible for me to concentrate on my studies

1. Under almost any conditions 2. When everything is quiet and I am alone 3. After a relatively long warm-up period 4. Only when intensely interested in the material 5. Under practically no conditions


D) Item 13. Choose the statement which best describes how you feel about your note-taking.

1. Exceptionally neat and efficient, no reorganization necessary 2. Neat and efficient, a little reorganization is sometimes necessary 3. Some reorganization always necessary 4. Considerable reorganization always [...] 5. [...] possible to get my notes [...]


E) Item 15. If I am to remain in college, it [...] be necessary for me to earn

1. All of my expenses 2. My board [...] 3. My board 4. My room 5. Very small part or none of my expenses


F) Item 22. Choose the statement which best describes how you organize your time.

1. I am usually able to do a little [...] is required of me in my courses 2. I am never behind with my assignments in my courses 3. [...] that I can do to keep up with [...] work required in my courses 4. [...] frequently do not have time to [...] minimum amount of work required in my courses 5. I never seem [...] to do the minimum amount [...] required in my courses.


G) Item 24. Before registering at Cornell, I

1. Visited the campus many times [...] home is nearby 2. Visited the campus [...] participated in 4-H or high-school activities 3. Visited the campus to talk about admission 4. Visited the campus to attend athletic events 5. Had only driven through or had never visited the campus


H) Item 25. The average number of hours I spend on study each week is

1. 30 or more 2. 25-29 3. 20-24 4. 15-19 5. Under 15


With the possible exception of items 15 and 24, all the significant items were in the "Study Habits" area of the Orientation Inventory. The items in the areas of "Motivation" and "Adjustment" do not appear as well represented in the final analysis. A partial score was then computed for all the 813 students, scoring only the items mentioned above. These partial scores were then correlated with the criterion and with the other prediction variables. The coefficients of these correlations are shown in Table III. [...] items on study habits did not correlate very highly with the criterion measure, [...] being only -.257, their coefficients of correlation with other [...] much lower, -.100 and below, [...] contribution to the multiple correlation [...] above.

[...] standard deviations and intercorrelations (as shown in [...]), beta weights and regression weights were obtained for each prediction variable. These are presented in Table IV. The previous regression equation was now changed into the following:


Xc = .366X1 + [illegible]X2 + .150X3 + .110X4 + .007X5 - .539X6 + 36.794


which has a slightly smaller positive weight for each of the previous prediction variables and a [...] for the study habits score of the Orientation Inventory. This rather [...] weight for the Orientation Inventory [...] its very small standard deviation as compared to the standard deviation of the criterion variable (σc/σ6 = 2.799).


The multiple coefficient of correlation (R) for this second multiple-regression equation was found to be .569 compared to .536 for the first equation. The coefficient of multiple determination this time was .323, compared to .287 for the first equation, which shows that we have accounted for 32.3 percent of the variance of freshman averages when we included the score on the study habits part of the Cornell Orientation Inventory among the prediction variables, compared to the 28.7 percent that we had accounted for by the previous equation without including the Orientation Inventory score.
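The relation between the two multiple correlations and the percent of variance accounted for is simply the square of R. A short Python check, with illustrative variable names, is given below. Squaring the printed R of .569 gives about .3238, so the reported .323 presumably reflects truncation or an R carried to more decimal places than printed; that reading is an assumption.

```python
def coefficient_of_determination(multiple_r):
    """Proportion of criterion variance accounted for: R squared."""
    return multiple_r ** 2


r_first_equation = 0.536    # without the study habits score
r_second_equation = 0.569   # with the study habits score

r2_first = coefficient_of_determination(r_first_equation)
r2_second = coefficient_of_determination(r_second_equation)

# Additional proportion of variance accounted for by adding the score.
gain = r2_second - r2_first
```

On the printed values the study habits score adds a little under four percentage points to the variance accounted for.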


Summary and Conclusions 


[...] low, -.100 and lower, which resulted in a rather large [...] large beta coefficient [...]


In the light of these facts, it would appear that


1. Research in this area should be continued.

2. The scope of the study habits part of the Cornell Orientation Inventory might be increased.

3. Other items having to do with college adjustment and motivation might be constructed.


FOOTNOTES 


* The tests in this study were administered in 1951, 1952, and 1953 by the Cornell University Testing Service. The Cornell Orientation Inventory was originally developed in 1948 by Dr. John P. Hertel and Dr. Francis J. DiVesta.


** Now Assistant Professor of Psychology at Lake Erie College, Painesville, Ohio.


Parviz Chahbazi. "Prediction of Achievement in a College of Agriculture," Educational and Psychological Measurement, XV (Winter 1955), pp. 484-87.


Parviz Chahbazi. "Use of Projective Tests in Predicting College Achievement," Educational and Psychological Measurement, XVI (Winter 1956), pp. -


Francis J. DiVesta, Asahel D. Woodruff, and John P. Hertel. "Motivation as a Predictor of College Success," Educational and Psychological Measurement, IX (Autumn 1949), pp. 339-48.


Robert L. Egbert and Glenn R. Hawkes. "Use of an Orientation Inquiry as an Aid in Predicting Success in College Agriculture Curriculum," Journal of Educational Research, XLIV (December 1950), pp. 295-302.


Hertel and DiVesta, op. cit., p. 394.
DiVesta et al., op. cit., p. 342.
Egbert and Hawkes, op. cit., p. 295.


For detailed descriptions of these tests see footnotes 1, 2, 3.
THE RELATIONSHIP OF PEER GROUP RATING TO
CERTAIN INDIVIDUAL PERCEPTIONS
OF PERSONALITY


J. W. RAMEY
Teachers College, Columbia University


TO WHAT degree will doctoral students who are familiar with one another in a department [...] themselves as a group? This question came to mind as a result of reviewing articles by Cattell and Cottrell (2, 3). The tendency of individuals to adopt a "we" attitude when they are working toward common goals, and to take on certain syntality traits, such as aggressiveness against groups, isolationism, or reliability in commitments, might well be strong enough to influence [...] students. The normative influence investigated by Asch, Gorden, Ho[...], Shonbar, and [...] (1, 4, 6, 8, 9), might also contribute to the [...] adoption of group norms, without the individuals' awareness that they were being influenced.
A group of students particularly well [...] for [...] doctoral students in a department of a [...] school who were in full-time residence [...] holders of departmental assistantships. These students were thrown together both in and out of class [...] much of the day and evening, whereas [...] the doctoral students [...] had full-time outside jobs and [...] campus only while attending class.
If [...] of the students who [...] departmental assistants were arbitrarily designated a "Peer Group", and asked [...] each other, and all doctoral students in [...] department (Others), would they tend to superimpose [...] image of their "peers" on [...], or would they react as though there [...] a difference [...] "Peer Group"? Does the tendency [...] individuals to adopt group norms [...] such individuals to rate themselves more like their peers or [...] others, or will [...] no difference?
This paper reports an experiment carried out for the purpose of answering these [...]. Specifically, two hypotheses were [...]:


1. Self ratings will correlate as significantly with ratings of most doctoral students (Others) as with ratings of Peers.
2. Peer ratings will correlate as significantly with ratings of most doctoral students (Others) as with ratings of Self.


Six scale scores from the California Psychological Inventory were used in this experiment. They seemed to be pertinent to this use; [...] chosen on the basis of the test manual [...] that it identifies random answers, a potential problem in this experiment.

The six scales are identified in the CPI test [...]:


Dominance: Assess those factors of leadership ability; dominance, persistence and social initiative expressed by such words as: aggressive, confident, persistent, planful; as being persuasive [...]; as self-reliant and independent; [...] having leadership potential [...].


Capacity for status: An index of an individual's capacity for status (not his actual or achieved status). The personal qualities and attributes which underlie and lead to status, such as: being ascendant and self-seeking; effective in communication; ambitious, active, forceful, insightful, resourceful, and versatile, having [...] breadth of interests.


Socialization: The degree of social maturity, integrity, and rectitude which the individual has attained, for instance: serious, honest, modest, [...] conscientious and responsible; self-denying and conforming, and industrious.


Conformity: The degree to which an individual's reactions and responses are dependable, moderate, tactful, reliable, sincere, patient and realistic; having common sense and good judgment.


Intellectual Efficiency: To what degree is the individual efficient, clear-thinking, capable, intelligent, progressive, thorough, resourceful; alert and well informed; placing a high value on cognitive and intellectual matters?


Flexibility: How adaptable is the person's thinking and social behavior? Is he insightful, informal, adventurous, confident, humorous, rebellious, idealistic, assertive, and egotistic; sarcastic and cynical; highly concerned with personal pleasure and diversion?


[...] female doctoral students. [...] were introduced through the [...] male subject. All of the [...]

[...] finished scoring the questionnaire, he was given a new scoring [...] for himself. This [...] numbered and marked "Self".

In this manner, each individual was rated by [...] definitions provided [...] on which he had rated himself and "Others", and within the same [...]. Each subject was [...] all subjects on one characteristic before moving on to the next characteristic.
All subjects cooperated willingly in the experiment although it took from 90 minutes to 18[...] minutes to complete the task. Each subject was asked not to communicate with the others in the "Peer group" regarding the experiment, and as far as is known, this request was honored.


Results 


Most subjects consistently rated Self higher than "Others" and in no instance did any subject consistently rate "Others" higher than Self. Peer rating and Self rating means were higher than the CPI mean for college students, but the "Others" rating mean was lower than the CPI college student mean.

Raw scores were converted into standard scores, using the tables in the CPI manual. The ranges, means, and sums of the squares for each characteristic are indicated in Tables I, II, and III.

Small sample statistics and the null hypothesis, using the Fisher t formula for testing the difference between uncorrelated means when samples are of equal size, seemed appropriate for this experiment. As reported by Guilford (5), this formula is:


t = (M1 - M2) / sqrt[(Σx1² + Σx2²) / (Ni (Ni - 1))]


where M1 and M2 are the means of the two samples; Σx1² and Σx2² are the sums of the squares in the two samples; and Ni is the size of either sample.


Since Ni = 12, there are eleven degrees of freedom, thus a confidence level of 2.201 (5%) and 3.106 (1%). Applying the formula, where S = Self rating, O = Others rating, and P = Peer rating, the t scores are shown in Table IV.
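The t ratio for two equal-sized samples can be sketched in Python as below. The sketch assumes the standard form of the Fisher formula for uncorrelated means with equal sample sizes, t = (M1 - M2) / sqrt((Σx1² + Σx2²) / (N(N - 1))), where the sums of squares are taken about each sample's own mean. The function names and the example scores are hypothetical; the critical values are the ones quoted in the text.

```python
import math


def sum_of_squares(scores):
    """Sum of squared deviations of the scores from their own mean."""
    mean = sum(scores) / len(scores)
    return sum((score - mean) ** 2 for score in scores)


def fisher_t_equal_n(mean_1, mean_2, sum_sq_1, sum_sq_2, n):
    """t for the difference between two uncorrelated means, equal sample sizes."""
    standard_error = math.sqrt((sum_sq_1 + sum_sq_2) / (n * (n - 1)))
    return (mean_1 - mean_2) / standard_error


# Critical values quoted in the text for the 5 and 1 percent levels.
T_05 = 2.201
T_01 = 3.106

# Invented summary figures for two groups of twelve standard scores.
example_t = fisher_t_equal_n(60.0, 50.0, 1100.0, 1100.0, 12)
significant_at_05 = abs(example_t) > T_05
significant_at_01 = abs(example_t) > T_01
```

With the invented figures the standard error is sqrt(2200 / 132), about 4.08, and t is about 2.45, which exceeds the 5 percent value but not the 1 percent value.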

These t scores indicate that the null hypothesis can be rejected with a great deal of confidence for half of these measures. This statement applies to Socialization, Communality, and Flexibility, when comparing Self rating and Peer rating; to Dominance, Capacity for Status, Socialization, Intellectual Efficiency, and Flexibility, when comparing Peer rating with "Others" rating; and to Intellectual Efficiency when comparing Self rating to "Others" rating. A rather low correlation is also indicated for four of the remaining t scores.


Tentative Conclusions


Our hypotheses that 1) Self rating would correlate as significantly with ratings of most doctoral
TABLE I
STANDARD SCORES FOR TWELVE DOCTORAL STUDENTS SELF RATED ON THE
CHARACTERISTICS: DOMINANCE, CAPACITY FOR STATUS, SOCIALIZATION,
CONFORMITY, INTELLECTUAL EFFICIENCY, AND FLEXIBILITY;
AS DEFINED IN THE CALIFORNIA PSYCHOLOGICAL INVENTORY


Do   Cs   So   Cm   Ie   Fx
78 - 1 qa - 2 59 - 1 63 - 1 67-1 13-1 
16-1 61-1 58 -3 58 - 3 64-2 64 - 4 
70 - 1 65 - 3 54-1 54-4 62-1 61-2 
68 - 1 62 - 2 52-1 49 -1 60 -3 59-1 
66 - 2 60 - 1 01-4 45-1 49-3 56-1 
64-1 57-1 49-1 40 - 2 47-1 53-1 
62-1 46-1 4-1 36-1 44- i 
94 -1 39-1 43-1 41-1 
42 -1 40 -1 

39-1 35-1 

33-1 


Zf(x')? 
TABLE II


Do   Cs   So   Cm   Ie   Fx
10-1 62-1 63-1 58-4 58 =] 73-1 
62 - 1 60-1 56-1 54-3 56-1 61-1 
48-1 57-1 Бі sd 49 -2 54-1 59-2 
46-2 49-1 41-1 40 - 2 43 -1 56-1 
42-1 44-2 45-2 26 -1 


TABLE III
STANDARD SCORES FOR "PEER GROUP" AS RATED BY TWELVE DOCTORAL STUDENTS
ON THE CHARACTERISTICS: DOMINANCE, CAPACITY FOR STATUS, SOCIALIZATION,
CONFORMITY, INTELLECTUAL EFFICIENCY, AND FLEXIBILITY; AS DEFINED IN
THE CALIFORNIA PSYCHOLOGICAL INVENTORY

80-1 80-1 
66-2 10-2 66-1 49-2 64-1 
62-3 61-5 65-3 45-4 62-3 13-3 
E 60 -3 65-1 63-3 40-3 60-3 10-4 
| 58 - 1 62-1 59 - 1 35-1 56 - 2 64-1 
Ñ 52-1 60-1 58 -2 31-1 54-1 
43-1 
| 44-1 54-1 
M = 44 M = 59 м =73 
ntes M = 6° de 48 454 201 
- Zf(x')?- 796 430 5 
TABLE IV
t-SCORES FOR SELF-OTHERS, SELF-PEERS, AND OTHERS-PEERS, FOR THE
CHARACTERISTICS: DOMINANCE, CAPACITY FOR STATUS, SOCIALIZATION,
CONFORMITY, INTELLECTUAL EFFICIENCY, AND FLEXIBILITY


Do Cs So Cm Ie Fx
.2044 1.494 5.384 2.555 .9168
4.132 5.677 6.389 1.829 9.172
-.3187 .3571 1.657 .5681 4.222 1.570


TABLE V
RANK ORDER CORRELATIONS FOR SELF-PEERS, OTHERS-PEERS, AND SELF-OTHERS;
FOR THE CHARACTERISTICS: DOMINANCE, CAPACITY FOR STATUS, SOCIALIZATION,
COMMUNALITY, INTELLECTUAL EFFICIENCY, AND FLEXIBILITY


                          SP     OP     SO
Dominance                .74    .30    .36
Capacity for Status      .92    .01    .17
Socialization           -.16   -.21   -.09
Communality             -.16    .23    .58
Intellectual Efficiency  .47    .30    .04
Flexibility              .24   -.05    .14



doctoral students (Others) as with ratings of "Peers", and 2) Peer ratings would correlate as significantly with ratings of most doctoral students (Others) as with ratings of Self, produced unexpected results. First, for this sample, Self rating correlated more significantly with ratings of most doctoral students (Others) than with ratings of "Peers". Also, Peer ratings correlated more significantly with Self ratings than with ratings of "Others". (See Table IV.)

While it was to be expected that peers would rate each other rather closely to self rating, an even closer matching between self rating and the ratings of "Others" was quite unexpected. Re-examination of the procedure used and the ratings obtained brought to light one disturbing factor that might account for such results. Therefore, this factor was checked to see how much it had influenced the findings.
Individuals, in their rating of peers on the scale provided, were found to have bunched some of their ratings, in a few instances rating as many as four or five peers at the same point on the scale. In order to overcome this tendency, each subject was again contacted, and asked to rank-order the peers on the scale. Ratings of Self and ratings of "Others" were then rank-ordered and rank order correlations were computed for each of the six characteristics, using the Spearman (7) formula:
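
The rank order computation can be sketched as follows, assuming the usual Spearman form, rho = 1 - 6Σd² / (N(N² - 1)), where d is the difference between the two ranks given to the same person and N is the number of persons ranked. The function name and the example rankings are illustrative assumptions; tied ranks are not handled.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two untied rankings of equal length."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    # Sum of squared differences between paired ranks.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * sum_d2 / (n * (n * n - 1))

# Illustrative rankings of five persons by two raters (not study data).
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(rho, 2))
```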


The Spearman rank order correlations obtained appear in Table V. We note that Self rating and Peer ratings correlate more significantly than Peer rating and ratings of "Others", as expected. We also note that Self rating and Peer ratings correlate more significantly than Self rating and "Others" ratings, and this seems more likely than the result obtained with unranked ratings. It is interesting that although none of the correlations is very high in the OP and SO columns in Table V, they seem to bear a marked resemblance to each other.


Conclusions


We can conclude, therefore, with this instrument, that:


1. Self rating will correlate more significantly with rating of "Peers" than with ratings of most doctoral students (Others).

2. Peer ratings will correlate more significantly with Self ratings than with ratings of most doctoral students (Others).

3. Ratings of most doctoral students (Others) will correlate as significantly with ratings of Self as with ratings of "Peers".


These results certainly do not preclude the possibility that with different subjects and/or different instruments, different results might be obtained. For this particular group, however, there would seem to exist at least the rough framework of a peer group, distinguishable from the universe of doctoral students in the department. This tendency to identify such a group might well be entirely due to halo effect caused by singling out these particular students for peer ranking and rating. The effect of this tendency could be checked by repeating the experiment with a different group of subjects selected completely at random within the department.
A COMPARISON OF OBLIQUE AND
ORTHOGONAL FACTOR SOLUTIONS


RICHARD W. COAN 
University of Arizona 


THE CHOICE OF rotational methods is one of the unresolved issues in the field of factor analysis. In this country, the principal discrepant viewpoints on this matter seem best represented by Guilford and Cattell. The position of Cattell (2) is that oblique factors better represent the real psychological entities with which we deal, for these entities as they occur in nature are correlated. Guilford (3) feels less certain that this alleged fact has been demonstrated. He prefers orthogonal factors on the ground that obliqueness is commonly a consequence of inadequate test sampling or of insufficient factor extraction.

A common argument for oblique rotation is that it enables us to achieve better simple structure, or a better approximation to simple structure, than is possible with orthogonal rotation. But the issue cannot be resolved on the grounds of economy alone, for either rotational method affords simplicity on one level at the expense of simplicity on another. The
ег 0my cannot readily pe weighed against the 
М Vand eo: simplicity of factor pattern (or struc 
ht open correlational independence of 
Ne i urate phenomena. A 
Der; nical obstacles as the 54 
bs im ns and envision the state of affairs that We 
ions ve with optimal sampling from both popu 
t оу We can see that the ultimate solution to the 
t for ersy depends on establishing 2 meaningful 
E сопот data. Тһе fundamental question 15 no 
Ë elegana? of the mathematical тобе» 
М Will ч) апа theoretical productivity- 
as est provided by the то el t 
йн, to reconcile h; theses arising £ оз 
Ong, e of related, bu independent, factor! 
ne In this paper, we shall not atte m 
Carne of factor reprod ucibility, D^ 
ll To a nsider a number of subsidiary prob в И 
раі broach a solution to the problem 0 
° een 200, we must consider the rela 
һа Ое the alternative procedures wi 
1 ation Xis in which we already possess & 
oa su m PTO DAE C We © 


e the possession of such informa- 
dealing with simple physical meas- 
blems provide a 
In the earlier of these (7), 


Thurstone assigned scores on 
ical boxes, each variable b 


justifiably assum 


In a later study (8), scores on 265 
ined for a sample of 30 actu: 


In the nypothetical case, 
led to an oblique solution 
tal measurements to corre- 


ond quite SO neatly to 


orthogonal factors 

e's earlier box prob- 
ariables, he de- 
solution can be convert- 


that 
monstrata; lent orthogonal solution with four fac- 


hich correspon 

hich corresponds to the 
r which Thurstone's proce- 
's analysis adds some 


es shou . i 
ахй to the аг ument that арра rent obliqueness 
may result from incompleteness of factor extrac- 
since thogonal solution could have been 

. ned directly by rotation infour dimensions, Not 
obtaine ill rotate in four dimen- 


lists” W1 
when tests of completeness of ex- 


icate three factors. 

On a similar mathematical basis, Schmid and Leiman (5) have more recently proposed converting oblique solutions to hierarchical orthogonal solutions. Their procedure yields as many orthogonal factors of a common order as there are factors of all orders combined in the oblique solution. Corresponding to the oblique factors of highest order there will be very general factors. At a lower level there will be a set of simple-structure orthogonal factors which resemble the first-order oblique factors. According to Schmid and Leiman, such a model preserves the simplicity of oblique solutions but eliminates the obscurity of information inherent in the use of correlated axes. To follow the procedure of Schmid and Leiman, of course, we must make an assumption which is basic to the execution of any higher-order analysis--namely, that the departures from orthogonality in our original oblique solution can be meaningfully related to recognizable characteristics of the tested population.

From the work of Thomson (6) and of Schmid and Leiman (5), it is evident that a cleanly rotated oblique solution can be converted to an orthogonal solution that preserves much of the simplicity of the oblique factors. It does not follow, however, that independent applications of oblique and orthogonal rotation by ordinary procedures will result in factors which match so neatly. Much of the controversy over the choice of methods relates to the relative ease with which we can identify or interpret factors derived directly from an oblique or orthogonal solution. It is partly at this issue that the present study is aimed.

The evidence affords a meager basis for hypotheses regarding differences between oblique and orthogonal solutions in factor content. One fairly safe prediction, from Thomson's analysis and from broader mathematical considerations, is that orthogonal rotation will tend to produce general factors not found in the oblique solution. This will, of course, hold only if there is appreciable obliqueness among the first-order factors.

As a further hypothesis, we might take the oblique rotater's contention that his factors are more easily interpreted than those of an orthogonal solution. A satisfactory criterion for testing such a hypothesis is hard to find, however, even if we treated the behavior of interpreters as a focal dependent variable. The related problem of factor reproducibility is more readily subject to experimental test, though the data to be reported bear less directly on this. At the present stage of research, it may be more useful to keep several considerations in mind and simply see what develops in an essentially exploratory study. Our findings should suggest more refined hypotheses that can be subjected to more careful scrutiny in subsequent research.


The Present Problem


We sought a realm in which satisfactory a priori knowledge of structure could be assumed; something a bit more complex than the box problem seemed desirable, so that some divergence like that usually found between oblique and orthogonal solutions might be expected. The chosen subjects consisted of 100 chicken eggs obtained commercially [...] 24 small, 24 medium, [...] large and 24 jumbo eggs.

For each subject, these six measurements were obtained: (a) weight, (b) linear length, (c) maximal linear breadth, (d) linear distance from the axis of maximal breadth to the smaller (i.e., sharper) end of the egg, (e) lengthwise circumference, and (f) maximal breadthwise circumference. These measures constituted variables 1 through 6 respectively. In addition, the ratio for every pair of the original measures was determined, giving the following 15 variables, which were numbered from 7 to 21:

 7. a/b     15. b/f
 8. a/c     16. c/d
 9. a/d     17. c/e
10. a/e     18. c/f
11. a/f     19. d/e
12. b/c     20. d/f
13. b/d     21. e/f
14. b/e

The scores for each of the 21 variables were normalized, and intercorrelations were determined by use of the Pearson product-moment formula. The advisability of normalizing scores which are already expressed in terms of physical scales of equal units will be questioned by some, but we may sidestep this area of controversy by noting that departures from normality in the original distributions were generally slight. The effects of normalization on our final results were probably small. The intercorrelations are shown in Table I.

The correlation matrix was factored by the [...] centroid method. For this purpose, values [...] were inserted in the cells of the principal diagonal. It should be noted that the communalities of our 21 variables are actually unity. [...] what would ordinarily be specific and error variance has been converted into common factor variance. We obtained communalities which fall just short of unity because of minor inconsistencies [...] rounding of scores. If a [...] correlation formula had not been used, the effects of grouping would have been [...] source of error variance.

A number of tests for completeness of extraction were applied, and these showed clear agreement with respect to the presence of six significant factors. We might reasonably have expected this, since the 21 variables are all functions of six independent measurements. Each communality obtained with six factors is barely less than unity. The six-factor centroid matrix is shown in Table II.
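
The construction of the 21 variables from the six measurements can be sketched as follows. This is an illustration only: the function names and the sample egg are assumptions, and the numbering simply follows the order in which the pairs a/b through e/f are listed as variables 7 to 21.

```python
from itertools import combinations

MEASURES = ["a", "b", "c", "d", "e", "f"]  # variables 1 through 6

def ratio_variables(measures=MEASURES, first_number=7):
    """Map variable number -> (numerator, denominator) for every pair of measures."""
    pairs = combinations(measures, 2)  # a/b, a/c, ..., e/f in listed order
    return {first_number + i: pair for i, pair in enumerate(pairs)}

def egg_scores(egg):
    """Scores on all 21 variables for one egg given as a dict keyed 'a'..'f'."""
    scores = {i + 1: egg[m] for i, m in enumerate(MEASURES)}
    for number, (num, den) in ratio_variables().items():
        scores[number] = egg[num] / egg[den]
    return scores

# Hypothetical measurements for a single egg (not data from the study).
example = {"a": 60.0, "b": 6.0, "c": 4.0, "d": 2.0, "e": 16.0, "f": 13.0}
scores = egg_scores(example)
print(len(scores), ratio_variables()[18])
```

Variable 18, for instance, comes out as c/f, the ratio of maximal linear breadth to maximal breadthwise circumference.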
TABLE I


INTERCORRELATIONS AMONG VARIABLES
Variable Number 


Variable 
umi 
рар s 6 7 g 9 10 11 12 13 14 


15 16 17 18 19 20 21 


96 26 -07 04 06 -10 01 


a 81 вә 69 92 92 96 97 93 97 

| 65 88 93 69 65 ва 62 71 84 52 19 43 54 286 чїй 17 385 55 
: 52 81 95 90 77 вв 86 78 -29 28 -29 -22 34 26 29 -30 -30 -18 
: gi 54 53 73 4o 59 T 5 go 4 56 5% -53 -07 55 62 59 
: в5 81 90 78 g2 89 26 љ n 27 =M -32 -02 -02 12 94 
| 93 83 91 89 go -18 31 121 £3 1 їй  -32 -30 -17 
н g1 97 98 go -17 25 at- 24 9 o4 -27 -26 -13 
I вт 94 98 a 23 0 m -o8 -27 -12 -07 04 22 
: gp 86 -19 47 -27 -21 37 13 05 -45 -39 -17 
" g4 -06 26 -13 -07 15 o2 03 -20 -17 -04 
z 19 n x 19 -08 -24 00 -04 07 25 
is -og 87 95 -s4 -94 -41 56 78 89 
i Jos 4p 55 10 oj -83 -62 -15 
i өд - 67 -12 5g 73 76 
25 -g1 -82 -10 61 83 95 
16 gi 34 -89 -96 -80 
ы s5 -48 -72 -86 
^B -08 -11 -09 
м 92 56 
ins 82 
21 
TABLE II


CENTROID FACTOR MATRIX


Centroid Factor


Variable
Number 1 2 3 4 5 6 h²
1 75 65 -01 -04 -03 02 .994 
2 93 20 16 23 10 -03 .995 
3 49 81 -18 23 10 -01 .994 
4 95 -04 -25 13 09 03 .993 
5 8 45 06 15 18 13 .991 
6 56 78 -08 02 24 -10 .989 
7 59 78 -10 -15 -07 -01 .992
8 8 52 07 -15 -09 04 .992 
9 50 84 10 -09 -08 -04 .987 
10 66 71 -06 -13 -16 -08 .994 
11 82 52 04 08 -18 09 .987 
12 64 -66 38 02 -01 -06 .988 
13 -09 53 81 17 -01 -11 .990 
14 48 -64 31 32 -23 -31 оз 
15 62 -64 29 29 -17 03 .993 
16 760 79 14 06 -о2 -o2 995 
"n 765 59 -38 18 -14 -14 ogo 
18 71 ?n -30 m -4 s. 986 
19 42 -73 -51 05 -09 -06 987 
20 56 -77 -21 12 -10 09 „м 
21 


rotation was facilitated by the use of a digital-computer operation whereby each successively calculated factor or reference-vector matrix was automatically plotted by an oscilloscope on film. Attention was directed to the actual variable composition of each factor only after a final solution had emerged.


The Oblique Solution 


The oblimax routine for Illiac was used to secure an initial rotation. Further rotations from two-dimensional plots led to the final solution shown in Table III. The corresponding transformation matrix and the matrix of intercorrelations among reference vectors are shown in Tables IV and V. The reference-vector structure manifests an unmistakable simple structure which could be improved only by very minute shifts. Some readers may feel concern about the high correlation between reference vectors 1' and 5'. Such extreme obliqueness is ordinarily avoided, since two factors or reference vectors derived from most kinds of psychological data would not be clearly distinguishable if they were permitted to correlate so highly. But if a factor analyst adopts the position that the obliqueness of his factors has some valid meaning beyond the peculiarities of his sampling of tests and subjects, then to be consistent, he should permit any degree of obliqueness demanded by his data. He should, in short, insist that each factor rest on its own merits in a position solely determined by its hyperplane--so long as it is meaningfully separable from other factors. In the present case, factors 1' and 5' have distinct loading patterns as well as distinct (but related) meanings.
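
The correlation between reference vectors 1' and 5' can be checked against the transformation matrix. The sketch below assumes that the centroid factors are mutually orthogonal, so that the correlation between two reference vectors is the cosine of the angle between the corresponding columns of Table IV. The helper name is ours, and the column values are those printed in Table IV with decimal points restored.

```python
import math

# Columns 1' and 5' of the transformation matrix (Table IV), rows = centroid factors 1-6.
COL_1 = [0.11, 0.08, -0.11, 0.51, 0.83, -0.13]
COL_5 = [0.17, 0.15, 0.02, -0.53, -0.82, 0.06]

def cosine(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

r_15 = cosine(COL_1, COL_5)
print(round(r_15, 2))
```

The result rounds to -.93, in agreement with the entry for 1' and 5' in Table V.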

If we accept the correlation between 1' and 5' as permissible, then we have what may well be regarded as a "unique" solution. The reader, of course, is invited to seek a better solution. It should be noted that, in the present solution, hyperplane values are virtually confined to a range from -.02 to +.02. But even if we define the hyperplane in terms of values between -.10 and +.10, we have a set of factors which, by Bargmann's (1) criterion, are all significant at well beyond the .001 level.

For interpretive purposes, it is useful to consider as well the factor-pattern matrix, the factor-structure matrix, and the matrix of factor intercorrelations, which are shown in Tables VI, VII, and VIII.
The oblique factors pose few interpretive problems. Considering factor structure, we see that factors 1' and 5' are size factors. In terms of the more distinguishable pattern characteristics, factor 1' is seen to be more specifically a volume factor, its pattern loadings being confined essentially to the length and breadth measurements. Factor 5' is most simply designated as a weight factor. There is a temptation to consider it a density factor, since the ratio variables--7, 8, 9, 10, 11--have higher loadings than does weight itself. The impression is misleading, however, for part of the unit variance assigned to variable 1 is necessarily absorbed into factor 1'. While variances are equated within the context of our analysis, variable 1 is actually of greater generality (and in a sense, of greater variance) than any of the geometric measurement variables. In support of the weight interpretation, we can note the high correlation between factors 1' and 5' and the fact that variable 1 actually has the highest correlation with factor 5' (cf. Table VI). The equating of variances actually causes some difficulty throughout our interpretations, for while it may not affect patterns per se, it produces distortions in the order of loadings within patterns.

Factors 2', 3', 4', and 6' all relate in some way to shape. Factors 2' and 6' relate to breadth. From the pattern loadings, we can see that factor 2' represents, in reflected form, the contribution of the breadthwise-circumference measure. Factor 6' represents the contribution of the simple maximal-breadth measure. In neither case do we get a simple breadth factor per se, for it is primarily the ratios formed by the two breadth measures that actually cluster about the factor axes. In both sets of pattern loadings, variable 18 is most prominent. This variable does not actually correlate too highly with either factor, however, since it is a measure of sidewise flattening, rather than of breadth as such. It must also be recognized that because of the high correlation between maximal breadth and breadthwise circumference, errors of measurement and of rounding may tend to overshadow true variance in the ratio of the two measures.

Factors 3' and 4' are concerned with length. Factor 4' is more clearly identifiable as a factor of relative length, or simple length with volume partialed out. Most prominent among both the factor-pattern and the factor-structure values is the ratio of linear length to lengthwise circumference. The contribution of a partial length variable (variable 4) is central to factor 3', which might be called a length-of-long-end factor. The ratios of the partial-length measure to linear length and to lengthwise circumference (variables 13 and 19) appear more prominent among the factor-pattern and factor-structure values than does variable 4 itself. This suggests an interpretation in terms of lengthwise proportions, but the fact is a consequence of variable 4's relevance to factor 1'. We must bear in mind, however, in interpreting all four of the non-size factors--i.e., 2', 3', 4', and 6'--that we are getting at structural components which are essentially independent of gross volume.

To complete the oblique-factor picture, we must consider the second-order factors which this solution yields. Factor analysis of the factor intercorrelations clearly yields two, and no more than two, significant factors. Stable communality estimates for the first-order factors were obtained by several iterations of the centroid procedure. Rotating the ultimate centroid solution, we arrive at two factors which are nearly orthogonal. The centroid matrix, rotated reference-vector matrix, and

156 


JOURNAL OF EXPERIMENTAL EDUCATION 


TABLE III 
OBLIQUE ROTATED REFERENCE VECTOR MATRIX 


Rotated Factor


Variable
Number 1' 2' 3' 4' 5' 6'
2 30 01 -01 21 -01 оо 
3 34 00 01 01 00 2 
4 27 001 44 -0 01 01 
5 33 07 -01 -03 -01 02 
6 35 -19 -01 01 00 -02 
7 01 -04 02 -05 35 02
8 -03 02 -01 00 37 -10 
9 01 -04 -21 05 32 0 
10 -06 -05 01 05 41 02 
11 -08 12 00 00 41 02 
12 -02 -01 -01 27 01 -32 
13 03 -01 -93 40 -01 -02 
14 -02 -06 -02 60 -01 -01 
15 -02 19 01 31 Ol o 
16 00 01 -50 00 00 24 
17 01 -03 00 00 -02 47 
18 -01 61 00 ^01 00 99 
19 00 -02 тт -01 бо o2 
20 700 15 52 00 00 OO 
21 - зво оз o o 
TABLE IV


TRANSFORMATION MATRIX FOR OBLIQUE
REFERENCE VECTOR SOLUTION


Rotated Factor


Centroid
Factor 1' 2' 3' 4' 5' 6'
1 11 00 23 08 17 -13 
2 08 -01 -29 -05 15 19 
3 -11 05 -91 34 02 -30 
4 51 29 -14 37 -53 76 
5 83 -41 07 -25 -82 -46 
6 -13 86 -07 -82 06 26 


TABLE V 


CORRELATIONS AMONG OBLIQUE 
REFERENCE VECTORS 


Rotated Factor


Rotated
Factor   1'    2'    3'    4'    5'    6'

1'            -31    10    06   -93    01
2'                  -17   -48    24    62
3'                        -29   -01    03
4'                              -03    06
5'                                    -01
6'
TABLE VI


OBLIQUE FACTOR PATTERN MATRIX


Rotated Factor


Variable 1' 2' 3' 4' 5' 6'

1 26 02 ‚00 00 16 04 
2 88 .02 -.01 39 -.03 00 
3 1.00 .00 .01 02 00 53 
4 79 02 .61 -.02 03 .02 
5 97 „17 -.01 -.06 -.03 04 
6 1.03 -. 47 -.01 02 00 -.04 
7 03 -.10 .03 -.09 99 04 
8 -.09 05 *, 01 00 1.05 -.19 
9 03 -.10 -.29 09 90 04 
10 -.18 =. 12 01 09 1.16 04 
1 -.23 30 00 . 00 1.16 04 
12 -.06 -.02 -.01 .50 03 -.62 
13 . 09 -.02 -1.28 ‚14 -.03 -.04 
14 -. 06 -.15 -.03 1.10 -.03 -. 02 
15 -.06 47 01 57 03 .02 
16 .00 02 -.69 00 . 00 47 
17 03 -.07 00 00 -.06 91 
18 -.08 1.52 00 02 00 1.93 
19 00 -.05 1.06 -.02 . 00 04 
20 -.03 .37 72 . 00 . 00 .00 
21 -.06 95 03 .06 .03 02 
TABLE VII


OBLIQUE FACTOR STRUCTURE MATRIX


Rotated Factor


Variable 1' 2' 3' 4' 5' 6'
1 95 12 -13 01 99 -08 
2 91 57 22 52 77 -49 
3 86 -15 -30 -22 87 28 
4 78 61 58 51 65 -52 
5 99 38 02 21 88 -30 
6 90 -14 -30 -20 90 13 
7 86 -11 -26 -20 98 12 
8 92 26 -03 13 97 -26 
9 83 -14 -43 -20 94 15 
10 86 -02 -18 -06 99 04 
11 90 28 -01 15 97 -21 
12 18 8 6 9 0 -94 
13 24 -13 -79 -06 25 11 
14 05 75 62 98 -10 -67 
15 18 94 67 96 00 -81 
16 -07 -79 -93 -76 09 82 
17 -24 -87 -55 -73 -07 99 
18 -03 -10 -10 -11 00 59 
19 -07 54 99 59 -17 -48 
20 05 81 95 Т -11 -"i 
21 24 99 64 83 05 -84 
transformation matrix are shown in Tables IX and X.

Since factor resolution was so clear at the first-order level, the second-order analysis yields factors which are quite readily interpretable--perhaps more readily interpretable than the first-order factors. The two factors correspond to the two fundamental dimensions which we should expect to determine most of the physical differences among eggs. Factor I' is a general size factor, combining the contributions of factor 1' (volume) and factor 5' (weight).

Factor II' is a general shape factor which may be characterized as a ratio of length to width, however these are measured. The positive loadings for 3' and 4' represent the positive contributions of linear length and a component of this. The positive loading for 2' and the negative loading for 6' represent the negative contributions of breadthwise circumference and maximal linear breadth.


The Orthogonal Solution 


An initial orthogonal solution was performed by the quartimax routine. The final solution shown in Table XI was obtained through further rotation from graphic plots. (The corresponding transformation matrix is shown in Table XII.) Orthogonal rotation proved more laborious than oblique rotation, since it yields no solution which can be regarded as unique. In interpreting the present solution, it is important to consider how it differs from the chief alternative solutions.

In the solution shown in Table XI, two general factors are evident--A and F. A would seem to be a general size factor, which incorporates the meanings of oblique factors 1' and 5'. It corresponds to factor I' of our second-order solution. F, on the other hand, is a general shape factor which corresponds to second-order factor II'. The loadings clearly mark it as a factor of length vs. breadth.


Factor B contrasts the simple volumetric variables (which have negative loadings) with the ratios formed by dividing weight by each of the volumetric measures (with positive loadings). As a positive function of density and a negative function of volume, B might best be characterized as a "compactness" factor. Factor C is apparently a reflected equivalent of oblique factor 3' and may be similarly interpreted. Factor D, on the other hand, is equivalent to oblique factor 6'. In the case of both C and D, the equivalence is one of pattern. Reference to [...] indicates that the orthogonal factors are by no means collinear with their oblique counterparts.

Factor E is one of the most difficult to interpret. It is a negative function of the lengthwise-circumference variable and also, to an extent, of the weight variable. But the total loading pattern does not support interpretations in terms of size, weight, or length as such. It is probably best considered essentially a product of variance which is specific to the lengthwise-circumference variable.
There are several alternative solutions which appear about equally satisfactory. Each involves a redistribution of the variance of one of the general factors with respect to a smaller-variance factor. Actually, there are only two planes in which we can make sizable shifts without detracting from the clarity of the factor structure. We can rotate factor B with factor A, and we can rotate factor E with factor F. In either case, the general factor will remain essentially unchanged in content, since it is only the low-loading variables that move into and out of the hyperplanes. Our interpretations of factors B and E, however, will be altered.

It is possible to rotate factor B either clockwise or counterclockwise with A. If we shift in one direction we augment the positive loadings for B and pull the negatively loading variables into the hyperplane. This gives us a factor of somewhat simpler meaning, which we may identify as "density." If we shift in the other direction, the positive loadings drop to insignificance and we are left with more substantial negative loadings for the volumetric variables. The factor then has a less distinct meaning. It would appear to be a negative function of that portion of volume which does not contribute to overall mass, or of volume with weight variance partialed out.
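
A shift of this kind is a rigid rotation of two factor axes in their common plane. The sketch below is an illustration under that assumption: the function names and the angle are ours, and the two rows are the A and B loadings of variables 2 and 10 as printed in Table XI. It shows that such a rotation redistributes loadings between the two factors while leaving each variable's communality in that plane unchanged.

```python
import math

def rotate_pair(loadings, i, j, degrees):
    """Rigidly rotate factor columns i and j of a loading matrix through `degrees`."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    rotated = []
    for row in loadings:
        new_row = list(row)
        new_row[i] = c * row[i] + s * row[j]
        new_row[j] = -s * row[i] + c * row[j]
        rotated.append(new_row)
    return rotated

def communality(row):
    """Sum of squared loadings for one variable."""
    return sum(x * x for x in row)

# Loadings on A and B for variables 2 and 10 (Table XI, decimal points restored).
rows = [[0.81, -0.22],
        [0.98, 0.18]]
shifted = rotate_pair(rows, 0, 1, 10.0)  # the 10-degree angle is arbitrary
for before, after in zip(rows, shifted):
    print([round(x, 2) for x in after], round(communality(after) - communality(before), 6))
```

Reversing the sign of the angle gives the shift in the opposite direction.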

For factor E, there is one alternative position attainable through rotation with F. The shift will give us positive loadings for the following variables (in order of descending magnitude of loading): 14, 19, 15, 12, 20, 2, and 4. Variables 16 and 5 will have negative loadings. The meaning of the factor becomes no less obscure. The essential effect of the rotation seems to be a shift in emphasis from variance specific to the lengthwise circumference measure to variance specific to the linear length measure.


Discussion 


The purpose of this study was to secure oblique and orthogonal solutions independently for a common set of original data, so that we might make meaningful comparisons and relative judgments that would be applicable to the two types of rotation as they are ordinarily performed. Insofar as rotation itself is concerned, we have performed the necessary operations. The skeptical reader is urged to repeat these operations to satisfy any doubts about the adequacy of the solutions. Criticism is more likely to be applied, however, to the original choice of variables. A common criticism of oblique rotation is that apparent obliqueness is a consequence of an inadequate sampling of the population of relevant variables.

This criticism can rarely be met in a completely definitive way, since it is not possible to circumscribe the total population of relevant variables unless we define the object of analysis itself in terms of a particular selection of variables. In the present case, this would mean that our consideration of
TABLE XI


ORTHOGONAL ROTATED FACTOR MATRIX


Rotated Factor


Variable A B C D E F 

2 81 -22 06 -03 00 55 
3 90 -23 03 25 00 -22 
4 69 -18 -41 -03 -03 57 
5 91 -23 04 -01 -18 28 
6 93 -24 05 -05 00 -21 
7 99 12 00 -01 -04 -19 
8 96 15 02 -12 -10 18 
9 93 11 23 00 -02 -21 
10 98 18 01 -01 04 -07 
1 95 19 01 00 -10 20 
12 03 00 01 -32 04 94 
13 


14 -07 00 07 -02 39 91 
15 02 00 01 01 05 99 
16 07 01 50 25 01 -82 
17 -07 -01 02 48 22 -84 
18 00 01 01 99 -о1 -10 
19 715 00 -75 00 16 бү 
20 709 00 -52 -oi - 84 
21 
TABLE XII


TRANSFORMATION MATRIX FOR
ORTHOGONAL SOLUTION


Rotated Factor


Centroid
Factor A B C D E F
1 75 -02 -21 -16 -03 63 
2 66 00 30 18 -04 -64 
3 -04 07 90 -28 -09 30 
4 -03 -53 21 76 18 27 
5 -02 -84 -05 -46 -23 -16 
6 -05 11 -04 27 -95 07 
the structure of eggs would be limited from the outset to those individual differences which are a function of our six fundamental measurements. Even without such an arbitrary delimitation of the problem, however, we can argue that all gross variations in any ordinary sample of eggs can be expressed in terms of various functions of our six original variables. Such a claim could not ordinarily be made with respect to any realm of psychological data.

The use of experimentally interdependent variables provides further grounds for criticism. One
might question whether it isl 
tor analysis at all to a corr 
ily constituted by the vario 
dependent measures. 


not constitute such a Serious problem, for the logi- 
cal distinction between Specific and common 
variance is an arbitrary one. In application, the 
dividing line is a f i 

variables, 


Conceptually, every factor must be bipolar as long as we use a geometric model composed of bipolar coordinate axes. Even when negative loadings are absent from a solution, the negative pole of every factor has an implied meaning opposite to that of the positive pole. In some realms of measurement, to be sure, we quite readily achieve positive manifold. This does not happen because the "real underlying factors" are unipolar, however, but because our intellectual habits lead us to score variables in a direction conducive to this. In the physical realm, the most convenient operations of measurement are usually (but not always) consistent with these intellectual habits. In the psychological realm, the direction is usually a more arbitrary thing, and often our intellectual habits dictate inconvenient operations. Thus an individual time score is actually a measure of slowness. To secure an index of speed, we must perform a transformation. The resultant positive manifold in a speed-loaded ability factor is thus an artifact. There is really nothing in the nature of objective reality that compels us to measure speed rather than slowness, intelligence rather than stupidity, or even bigness rather than smallness. Furthermore, both extremes are necessarily implicit in whatever factor we derive, even though we take pains to make only one pole of the factor explicit.

Much of the debate regarding rotational methods
centers about the economy of either type of model.
It is decidedly easier to attain simple structure
through oblique rotation. In this problem, oblique
rotation yields a clear simple structure. The orthogonal
solution fails to do so in that each of the low-variance
factors tends to share its high loadings with one of the
two general factors. This is not an unusual situation. It
can probably be said that for most of the kinds of problems
to which factor analysis has been applied, orthogonality
and complete simple structure are incompatible goals.

We cannot make a clear choice of methods, however,
on the basis of economy alone, for the oblique solution
by its very nature introduces greater complexity with
respect to the interrelationships among factors. The
oblique solution best furnishes what might be called
factor simplicity (i.e., simple structure), while the
orthogonal solution provides what we might designate model
simplicity. Both factor simplicity and model simplicity
are undoubtedly desirable, but since they are
incommensurable quantities, there is no way of clearly
determining whether the oblique or the orthogonal solution
provides greater economy.

With respect to the clarity of coordinate-axis positions,
or the uniqueness of the solution, the oblique solution is
definitely superior. It furnishes one best set of factors
which align themselves neatly with trends evident in the
test configuration. The location of orthogonal axes is more
arbitrary. It can be argued, on this basis, that orthogonal
factors will manifest less reproducibility. This is most
clearly the case when independent sets of rotations are
made from the same original centroid factors. Two oblique
rotators working independently from the present centroid
matrix should obtain nearly identical solutions. We do not
have the data necessary to indicate how reproducibility
will be affected when we use different samples of variables
and persons (or objects) in a second factor analysis.

As it happens, egg measurements were earlier
analyzed by Muhsam (4), but the semi-artificiality
of this study and of Muhsam's limits the meaningfulness
of any comparison. Muhsam's final solution contained
factors which we can identify as simple length, simple
breadth, and volume with simple linear length and breadth
partialed out. These most nearly match factors in our
oblique solution, but this is to be expected since
Muhsam's solution was also oblique. There is a great need
for studies more deliberately planned to yield information
regarding the reproducibility of oblique and orthogonal
factors. Insofar as an unambiguous solution within each
individual factorization is a prerequisite for
reproducibility, the present evidence argues for the
greater reproducibility of oblique factors.

We may also compare the solutions with respect
to the content of the obtained factors. The relationship
which we anticipated initially is borne out well. The
oblique solution culminates in two second-order factors.
In meaning, they are obviously equivalent to the two
general factors found in the orthogonal solution. It may
be concluded that we can ordinarily expect general
orthogonal factors to be paralleled by second-order oblique
factors, provided that one permits sufficient obliqueness
for well-defined second-order factors to emerge. To this
extent, oblique and orthogonal solutions tend to be
interchangeable.

Perhaps no one will argue that either solution
gives us the “real” underlying factors whereas the other
does not, for in either case we obtain a set of factors
with which we can account for our original data. The
usefulness of a factor, however, depends somewhat on its
clarity of meaning, or on the ease with which we can
interpret it. There is no sharp difference between the two
solutions in this respect, though it appears to the writer
that the oblique factors on the whole are easier to
interpret and simpler in content.

It might be expected that an orthogonal solution
would afford less interpretive confusion of factors.
This assumption appears to be erroneous. Linear
independence as such does not make for greater
interpretive distinctness. In the oblique solution,
factors 1' and 5' are most highly correlated. As one would
anticipate, they are closely related in meaning. Yet the
meanings are clearly distinguishable, and the relationship
between them makes good sense. Some pairs of orthogonal
factors (e.g., C and E) are no more distinct in meaning,
even though they manifest linear independence.

One further difference between the two solutions
may be noted, but its implications are difficult to
evaluate. Both solutions tend to yield factors which
are essentially composed of the specific variance
which we have transformed into common-factor
variance, but the oblique solution yields these
“transformed specific factors” in more nearly pure
form. Thus, the significantly loading variables for
oblique factors 2', 3', 4', 5', and 6' are essentially
functions respectively of the original measurements
listed above as f, d, b, a, and c. We cannot properly
make any relative judgment regarding the adequacy
of the solutions on the basis of this finding. If we
had originally chosen six measurements which were
not intercorrelated and incorporated their various
ratios into the score matrix, we would inevitably
have arrived at six such “transformed specific”
factors. Any rotational method would have led us
to the six uncorrelated factors which we had thus
created. In the present problem, we have confounded
what were originally common-factor variance
and specific variance, and we can make no a priori
judgment as to how these should ideally be parcelled
out among a set of rotated factors.


One further matter that merits our attention is
the occasional argument that oblique rotation yields
inter-factor correlations that do not meaningfully
reflect functional relationships within the individuals
or objects studied. Thus, Thurstone’s box
problem yields oblique factors of height, length,
and breadth. Since height, length, and breadth are
orthogonal components of any individual box, it
is the nature of the population sampled that makes
the corresponding factors oblique. The underlying
problem here is actually one that pervades all factor
solutions. For in R-technique, the correlations
among variables and among factors are necessarily
a direct function of covariation running through the
population and only secondarily a function of
functional relationships within individuals. Perhaps it
would be useful to distinguish between object factors
and population (or sampling) factors. We can think
of the box problem as containing three object factors
and one population factor. These coexist in
three-dimensional space, since the population factor
(“general size concomitance”) is mathematically
expressible in terms of the three basic measures
that underlie the three object factors.


If we think of Thurstone’s second-order factor
as a population factor that imparts covariation to
height, length, and breadth, it is understandable
that only an oblique three-factor solution will yield
the patterns of the three object factors (which are
linearly independent within objects) in pure form.*
To produce a comparable result orthogonally, we
should have to add a null centroid factor and rotate
in four dimensions, even though centroid extraction
clearly yields only three factors. If we settle for a
three-factor orthogonal solution, we must be con-


* Because R-technique factors are explanatory constructs
adduced to account for inter-individual differences, we
cannot always expect factors to correspond this simply,
in nature or number, to the physical dimensions of the
individual object.
es with one another.
that oblique factors 


based on our knowledge of 
Summary 


À comparison of oblique and ortho 
Solutions was Sought th: 
in the orthogonal solution. The oblique solution,
though utilizing a more complex mathematical
model, afforded better simple structure. The oblique
solution was found to be superior in providing
an unambiguous, or “unique”, position for the
coordinate axes. It is suggested that the oblique
solution also yields factors of greater interpretive
clarity.
DEVELOPMENT OF THE SAMPLE SURVEY AS
A SCIENTIFIC METHODOLOGY 


PALMER O. JOHNSON 
University of Minnesota 


IN THE FIELD of scientific research today it 
is the statisticians who design the experimental 
programs and observational surveys. Likewise, 
it is they who analyze the results, assess the evi- 
dence and differentiate between that which has 
been clearly established and that which still needs
verification. 


Experiment and Survey 


Two principal means of bringing scientific
knowledge into being are the experiment and the
survey. The real distinction between the survey
and experiment for determining “cause-and-effect”
relationships is that in the experiment the
research worker exercises control over the
forces that are put in motion while in the survey
he is investigating forces over which he has had
no control. In the experiment, the population(s)
under study are constructed in a particular way.
In a survey dealing with the same problem, the
population under consideration has originated
from a set of factors whose relation to the forces
under examination is unknown. It is the exercise
of this “control” that differentiates experimenting
from surveying.

Experimentation gives rise to the most desirable
fruits of the scientific method. In experimentation
we are able to determine the consequences
of altering factors, e.g., we can find out
what effect changing a factor A has on another
factor B such that knowledge is acquired basic to
taking action. Surveys can only detect the existence
of associations between factors in the population.

When the aim of the investigation is descriptive
no such general advantages of the experiment
prevail over the survey. Since in practice the research
worker is not always free to elect the method
which might seem to be superior, he should be
ready to employ either method. In some cases it
may be advantageous to use a combination of both
methods. Thus the researcher might use the survey
to explore and the experiment to study the situation
in greater detail. At times the survey may
serve to identify factors that are worthy of
experimentation. Also surveys are especially useful in
situations in which it is very difficult or perhaps
impossible to conduct an experiment. This has
been particularly true in the field of human genetics.
Thus it is that methods in science vary with
the nature of the problem to be investigated.


Development of the Sample Survey


A well-known classical example of a survey is
a census of population, of which there are some
interesting historical antecedents. The Book of
Numbers in the Old Testament is a simple example
of a survey, a written record resulting
from an enumeration or counting of the wealth of
the tribe in terms of persons and animals. An
early example of an economic survey, perhaps the
first in England, was the Domesday Book. This
survey was carried out by William the Norman
who verified the principle that if a country can be
efficiently governed it is essential to discern what
comprises the wealth of that country.

The U.S. Census required by the Constitution 
has been conducted every ten years since 1790. 
The original purpose of the Census was to ascer- 
tain the number of inhabitants in the United States 
and their residence to furnish the basis by which 
the number of representatives each state would be 
granted in Congress. More recently the Census 
has been made use of for additional purposes 
among which is an important source of informa- 
tion for administration and research in govern- 
ment, business, labor, and other fields. 

In a free and progressive society, business,
the government, the professions, and increasing- 
ly, labor, all are continuously in search of the 
widest and soundest possible factual basis for mak- 
ing decisions, formulating policies, and for devel- 
oping scientific, social, and economic theory. It 
is this joint quest particularly for economic and 
social facts that accounts in a large part for the 
very great significance of and emphasis on statis- 
tics. Hundreds of millions of dollars are expend- 
ed annually by both public and private agencies. 

The traditional method for collecting social and
economic statistics has been that of complete coverage
and enumeration, i.e., the procedures of
the Census. Theoretically, at least for those population
characteristics that remain relatively constant,
this practice seems to be the best. However,
such an undertaking is costly, difficult to
plan and conduct, restricted to a relatively few
items of information, time-consuming, and
liable to be out of date by the time the findings
are published. For example, even with modern
sorting and collecting equipment, it will take several
years before the results of the next Census
are processed and published, so that if interested
persons have to wait for the published results
there would be time for the structure of the population
to change. This is a case where the collection
of too much material may be at times as obstructing
and produce consequences as misleading
as the gathering of too little. It is of interest
in this connection to note that in the 1940 Census,
the Bureau of the Census introduced a sampling
procedure by including a set of supplementary
questions which were answered by a sample of
one person in twenty. In fact, the statisticians
of the Bureau of the Census have played a role of
leadership in the rapid development of the theory
and practice of sampling in recent years.

It is the develop 
made it possible to 


social scientists, 
e results of sam- 


collection of many 


as only possible by the use 
of sample surveys. 


The Sample Survey as a Scientific Method 


understandi 
found out about the p à 


* All footnotes will be found at end of article.


the research worker to make a critical appraisal
of the purposes the statistics collected are to serve.
A specification of the population and the decision
as to the precise purpose of the investigation usually
determine certain parameters of the population
to be estimated or certain hypotheses to be tested.
The problem of inductive inferences is defined by
the object of the sampling survey to secure accurate
answers to certain clearly defined questions.
The brief sketch so far made of the sample survey
has indicated the derivations of a method which
is the most tractable, speedy, economical, and, in
reality, scientific method so far available for the
purpose in mind. Many examples support the first
three claims. The claim of its unique scientific
character is well stated by Professor R. A.
Fisher¹*.


     ... why do I say that it (the sample survey)
     is more scientific than the only procedure
     with which it may sometimes be in competition,
     the complete enumeration? The
     answer, in my view lies in the primary process
     of designing and planning an inquiry by
     sampling. Rooted as it is in the mathematical
     theory of the errors of random sampling,
     the idea of precision is from the first in the
     forefront. The director of the survey plans
     from the first for a predetermined and known
     level of precision; it is a consideration of
     which he never loses sight, and the precision
     actually attained, subject to well understood
     precautions, is manifest from the results
     of the inquiry.


Diverse Fields of Application 


Such a sharp and accurate tool as probability
sampling did not meet with uniform acceptance and
use in all fields of research. Even today, there
is an increasing need of application of modern
sampling survey methods to practical situations.

The sample survey has already proved to be economical,
of high accuracy, and especially adaptable
in comparison with older methods, in fact-finding
in economics, vital statistics, and in the
programs of the U. S. Bureau of Census.

In the field of productive industry, the Quality
Control Engineers have developed special uses
known as sequential sampling by which they have
demonstrated how the efficiency of mass-production
may be reconciled with increasing demands
for precision and reliability of product. The assessing
of consumer preferences has made considerable
progress through the use of sampling and
it has become possible to specify these actual
requirements to design and production engineering.
Some beginnings have been made on the national
scale using probability sampling methods in the
collection of educational statistics and occasionally
by individual investigators of educational
problems. However, most workers in this field
as well as in psychology, sociology, and in other
social sciences seem to be unfamiliar with or, if
familiar, not using modern sampling methods. It
is the status of the sampling problem in social
science fields that motivated, at least to some extent,
the writing of this paper.


General Methodology in Sampling Surveys 


The planning of sampling designs is usually 
involved in two situations: experimental investi- 
gations and descriptive or analytical surveys. It 
is only the latter situation that concerns us in 
this study. It may be pointed out here, however, 
that the sweeping theoretical and technical ad- 
vances leading to the principles of modern exper- 
imental designs produced a startling advance 
as well in the development of sampling designs
and techniques. 

Turning our attention now more specifically 
to the sampling survey, we find that the statis- 
tician must concern himself with determining the 
number of observations to be drawn from the pop- 
ulation and what method of sampling should be 
used. There is also the practical problem of se- 
lecting the method which will provide the desired 
degree of precision at minimum cost. The cen- 
tral problem of estimating an unknown parameter 
of the specified population is one of finding a func- 
tion of the observations that is the best estimate 
of the parameter. These statistical aspects as 
well as others will be encountered in a number of 

places in the sequel.

Probability and Judgment Samples— It should
be emphasized that we are dealing with probability
and not judgment samples. In probability
sampling, there are these distinguishing features:


1. Every individual (primary sampling unit)
in the sampled populations has a known probability
of being included.

2. The sample is drawn by a process which
necessitates one or more acts of automatic
randomization conformable with the probabilities in
number one.

3. Weights appropriate to the probabilities in
number one are applied in the analysis of the
sample results.

Probability samples may be self-weighting.
Frequently samples are drawn such
that each individual (sampling unit) in the population
has an equal chance of being included in the
sample. In such a sample, if it is desired to estimate
the arithmetical mean of the population of some
characteristic, the proper procedure is to calculate
the unweighted mean of all the members of the
sample. Since here the weights are equal the
sample is said to be self-weighting. This situation
satisfies the criteria of a probability sample
since the relative chances of different individuals
being included in the sample are known and taken
into account in the weighting (being in this case
equal). However, this is not the only type of a
probability sample. Instead of giving each member
of the population an equal chance of being included
in the sample and then being weighted equally,
a process of compensation may be used where
those individuals more liable to be included in the
sample are given less weighting while those less
likely to enter the sample are given more weighting
when they do occur. In this kind of probability
sample, called general probability sampling,
each individual item is given an equal chance of
influencing the (weighted) sample mean.
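
The compensation described above can be illustrated with a minimal sketch. It assumes, purely for illustration, a hypothetical population of small and large units, independent inclusion of each unit with its own known probability, and weights equal to the reciprocal of that probability; none of these particulars come from the text, and the function names are our own.

```python
import random

def weighted_mean(values, inclusion_probs):
    """Estimate a population mean, weighting each sampled value by 1/probability."""
    weights = [1.0 / p for p in inclusion_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def draw_sample(population, probs, rng):
    """Include each unit independently with its own known probability (an assumption)."""
    return [(y, p) for y, p in zip(population, probs) if rng.random() < p]

rng = random.Random(0)
# Hypothetical population: 900 small units (value 10) and 100 large units (value 100).
population = [10.0] * 900 + [100.0] * 100
true_mean = sum(population) / len(population)
# Large units are five times as likely to be drawn as small ones.
probs = [0.1] * 900 + [0.5] * 100

sample = draw_sample(population, probs, rng)
values = [y for y, _ in sample]
sample_probs = [p for _, p in sample]

unweighted = sum(values) / len(values)          # ignores the unequal chances
weighted = weighted_mean(values, sample_probs)  # compensates for them

# With equal probabilities the weights cancel: the sample is self-weighting.
equal_prob_check = weighted_mean([2.0, 4.0, 9.0], [0.25, 0.25, 0.25])
```

When every unit has the same inclusion probability the weights cancel and the weighted mean reduces to the ordinary unweighted mean, which is the self-weighting case.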

In judgment or non-probability sampling, no
chance system enters into the selection of the sampling
elements. The sample is restricted to units
believed by someone to be particularly typical of
the population or chosen for their convenience,
e.g., “grabbing a handful” or taking a “chunk”
out of the population. Such samples vary greatly
in actual and apparent trustworthiness. This type
of sampling makes impossible the measurement of
the precision of the sample results from the sample
itself. The probability that an individual is included
in the sample is unknown. No objective
basis is known for measuring the confidence that
can be placed in the estimates provided by such a
sample.

Sampling and Non-Sampling Errors— We speak 
of sampling procedure as unbiased if the mean of 
the frequency distribution of the estimates that it 
produces is exactly equal to the population param- 
eter that is being estimated. 

By sampling error or the precision of a sample
result is meant how closely we can reproduce
from a sample the results which could be obtained
if a complete count of the population were made
under the same conditions. The difference between
the sample result and the true value (population
parameter) is called the accuracy of the
sample survey. We are most interested in the
accuracy but it is the precision that is most
frequently measured. The statistician aims to set up
the sample design such that the combined effect
of the accuracy and precision will be at a minimum.

As a working basis, it is often stated that the 
effect of bias on the accuracy of an estimate may 
be taken as negligible if the bias is less than one- 
tenth of the standard deviation of the estimate. 
The standard deviation of an estimate as calculat- 
ed from the sample does not contain the contribu- 
tion of the bias. However, any biased method 
must be interpreted with caution. There may al- 
so be bias in the estimate, which is unsuspected. 
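
The one-tenth rule can be checked with the usual decomposition of mean square error into variance and squared bias. This is a textbook identity rather than something stated in the article, and the symbols (estimate, true value, standard deviation of the estimate, bias B) are our own notation.

```latex
\[
\mathrm{MSE}(\hat{\theta}) \;=\; E\bigl[(\hat{\theta}-\theta)^{2}\bigr]
\;=\; \sigma^{2} + B^{2},
\qquad B = E[\hat{\theta}] - \theta .
\]
\[
B = 0.1\,\sigma
\;\Longrightarrow\;
\sqrt{\mathrm{MSE}} \;=\; \sigma\sqrt{1 + 0.01} \;\approx\; 1.005\,\sigma .
\]
```

Under these assumptions a bias of one-tenth of the standard deviation raises the root mean square error by only about one-half of one per cent, which is the sense in which it may be called negligible.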
It is not possible to anticipate in any specific
circumstance the magnitude of the error present
in the estimate made. This determination would
necessitate a knowledge of the population value.
The standard method in statistical theory by which
precision is assessed is that of investigating the
frequency distribution generated for the estimate
by repeated sampling from the same population.
A useful simplification results from the assumption
often observed in practice that the sample estimates
are approximately normally distributed.
Accordingly, a measure of sampling error is obtained
by calculating the sampling variance of the
estimate, the reciprocal of which provides a measure
of its precision.
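
The idea of a frequency distribution generated by repeated sampling can be sketched by simulation. The example below assumes a hypothetical population of 2,000 scores, simple random sampling without replacement, and the sample mean as the estimate; the closed-form variance used for comparison is the standard textbook expression, not a formula given in the article.

```python
import random
import statistics

def sampling_variance_by_repetition(population, n, repetitions, rng):
    """Empirical variance of the sample mean over repeated simple random samples."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repetitions)]
    return statistics.pvariance(means)

def theoretical_variance_of_mean(population, n):
    """S^2 / n * (1 - n/N) for sampling without replacement from N units."""
    big_n = len(population)
    s2 = statistics.variance(population)  # divisor N - 1
    return s2 / n * (1 - n / big_n)

rng = random.Random(42)
population = [rng.gauss(50.0, 10.0) for _ in range(2000)]  # hypothetical scores
n = 100

empirical = sampling_variance_by_repetition(population, n, 4000, rng)
theoretical = theoretical_variance_of_mean(population, n)
precision = 1.0 / theoretical  # reciprocal of the sampling variance
```

In practice the population is not available for repeated sampling, so the variance is estimated from the single sample drawn; the simulation only shows what that estimate is trying to approximate.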

Much of sampling theory is concerned with the
derivation of formulas for the sampling variances
of estimates secured by a variety of sample designs
and procedures. The investigator has to be
on guard in the use of formulas presumably used
to measure precision without taking into account
the method by which the sample was selected.
Appropriate selection is based on two fundamental
bases: (1) formulas measuring sampling errors
should be based on information of the probability
of an individual being included in the sample, and
(2) such formulas are contingent upon the particular
sampling design applied.

Among non-sampling errors in surveys are
(1) errors due to non-response resulting from
lack of measurement of sampling units in the sample
due to inability to locate some individuals or
their unwillingness to answer when found, (2) errors
of measurement resulting from unreliability
and low validity in information provided, and (3)
errors in processing, editing, and tabulation of
the sample results.


eration. Most of the 
has related to ways of re- 


+ However, a beginning 
has been made in m eeting other types of diffi- 


ure to execute directions including entries put 
down by pure guess. These returns are not amenable
to statistical or probability analysis.

The margin of error of the final estimate thus
involves sampling fluctuations, observational errors
falling partly within the scope of the classical
theory of errors, and inaccuracies due to false
entries or gross negligence on the part of the
investigators or respondents. The latter type may
also contain other systematic errors.

While it is practically impossible to bring the
whole survey enterprise under what we call statistically
controlled conditions by eliminating systematic
errors, it is possible to provide statistical
controls for detecting and guarding against
many recording errors. Such ways, for example,
would be the conduct of two or more independent
surveys and the use of interpenetrating samples.
A comparison of the different investigations
would reveal the magnitude of recording mistakes
and when more than two sets of records existed it
would enable unreliable workers to be identified.

Every effort must be made to obtain complete
information for every member of the sample. This
includes such plans as following up a random sample
of delinquents. Even with this effort, it is usually
impossible to secure complete coverage, particularly
in human sampling, since some persons
cannot be found, others are unable or unwilling to
respond, and a few may have died. It is desirable,
therefore, to distinguish between the exact population
which has been sampled and the population
initially defined in the plan of the investigation.
These are called the sampled population and the
target population, respectively.

Population and Sampling Unit— An aggregate of 
individuals that possess a common character or
characteristic may be termed a population. In a
sample survey the populations with which we are
concerned contain a finite number of units. This
situation differs from the conception of the infinite
population which plays a dominant role in statistical
theory. The difference in populations leads to
different methods used to prove theorems. The
results are slightly more involved when sampling
is from a finite rather than an infinite population.
The differences, however, are seldom important
for practical purposes. The condition of an infinite
population is assumed to be fulfilled in practice
by sampling with replacement.²
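
The difference between the finite and the infinite case can be made concrete by enumerating every possible sample from a very small population. The five values below are hypothetical, and the two closed-form expressions are the standard textbook variances of a sample mean, supplied here as an assumption for comparison rather than taken from the article.

```python
from itertools import combinations, product
from statistics import fmean, pvariance, variance

population = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical finite population, N = 5
N, n = len(population), 2

# Every equally likely sample of size n, without and with replacement.
means_without = [fmean(s) for s in combinations(population, n)]
means_with = [fmean(s) for s in product(population, repeat=n)]

var_without = pvariance(means_without)  # exact variance of the mean, finite case
var_with = pvariance(means_with)        # exact variance of the mean, "infinite" case

# Textbook formulas for comparison.
formula_without = variance(population) / n * (1 - n / N)  # S^2/n * (1 - n/N)
formula_with = pvariance(population) / n                  # sigma^2 / n
```

The finite-population variance is the smaller of the two, and the gap closes as the sampling fraction n/N becomes small, which is why the distinction is seldom important in practice.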

The population we study may be large or small,
but there must be a clearly defined population to
begin with. What we study is some aspects or
characteristics of the population. It is the population
we wish to characterize from the information
obtained from the sample, which is usually a
small part of the population. We do not carry out
sampling studies to learn about the properties of
individuals.


A population may at times be divided into units
in a number of ways. For example, we may consider
a city as comprised of a number of city
blocks, as an aggregate of households, or of a
number of persons. Since an alteration in the
type of unit commonly affects both cost and
precision of the sample, the selection of the best
type of sampling unit is usually significant in the
economies of sampling.

We could select a sample of residents of a city
by compiling a complete list of all the residents
and then by selecting names of individual residents
at random from the list. If we prepared a
list of the blocks of a city (by the help of a map,
say) and then selected a number of blocks at random,
we would obtain a random sample of blocks,
in each of which there would be one or more residents.
The sampling unit in this case, namely,
the block, is sometimes called an “area” or
“cluster” sampling unit.
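
The two ways of sampling a city's residents can be contrasted in a short sketch. The city, its 50 blocks, and the resident labels are hypothetical, and taking every resident of a selected block is one simple form of cluster sampling assumed here for illustration.

```python
import random

def sample_individuals(residents, n, rng):
    """Element sampling: draw n residents directly from a complete list."""
    return rng.sample(residents, n)

def sample_blocks(blocks, m, rng):
    """Cluster sampling: draw m blocks at random and take every resident in them."""
    chosen = rng.sample(sorted(blocks), m)
    return {block_id: blocks[block_id] for block_id in chosen}

rng = random.Random(7)
# Hypothetical city: 50 blocks, each with 1 to 12 residents.
blocks = {
    f"block-{b:02d}": [f"resident-{b:02d}-{i}" for i in range(rng.randint(1, 12))]
    for b in range(50)
}
residents = [r for members in blocks.values() for r in members]

element_sample = sample_individuals(residents, 20, rng)
cluster_sample = sample_blocks(blocks, 5, rng)
cluster_residents = [r for members in cluster_sample.values() for r in members]
```

The block sample needs only a list of blocks rather than a list of all residents, which is the usual cost argument for it; the number of residents obtained is not fixed in advance.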

From a knowledge of certain facts, it becomes
possible to ascertain for a fixed cost or for a
given accuracy, the optimum type of unit for a
given investigation. Some techniques are available
for obtaining information about the optimum
type of sample unit. For example, the analysis
of variance can be used in the comparison of the
precision of large and small sampling units.


Basic Principles of Sample Surveys 


In order to obtain the kind of impartial, well-founded,
and systematic knowledge at which the
sample survey method aims, principles of design
have been built up. The precision of the results
procured from the sample survey is contingent
not only on the size of the sample but also on
other aspects of the sample design, such as the
way the sample is chosen and the process of calculating
the estimates from the survey results.

Usually there are a plurality of alternative
sample designs that might be used in a particular
problem and a comprehension of alternative designs
including a contrast of their relative efficiencies
is required if an appropriate selection
is to be made. For this purpose modern sampling
theory is providing powerful tools. The
principle previously referred to of specified precision
at minimum cost enters repeatedly in modern
theory.

We will now go from our discussion of the general
methodology in sampling surveys to certain
specific sampling designs.


Sample Designs for Some Common 
Sampling Problems 


In planning sample surveys one proceeds in accordance
with fundamental principles to fit a specific
design to a projected investigation. There
are no general rules leading to the selection of a
design. Each problematic situation presents its
own problems. A practical working guide is to
use the simplest design best meeting the needs
of the inquiry. This does not exclude the use of
complex designs when these best serve the investigator’s
purpose. The sampling plan should be
representative. The plan must include the way in
which the sample is to be drawn, the relative
chances for the selection of any two possible samples,
and the analysis specified which is to be
used on the sample results.

We can describe only briefly the main types of 
designs. We do so with the purpose of giving the 
interested reader an insight into sampling survey 
procedures with a minimum of mathematical sym- 
bols and of unexplained technical terms. 

Simple Random Sampling—This is the most
elementary type of sampling problem. When
every element of the population has an equal
chance of being included in any sample, and when
that chance is unaffected by the corresponding
chance for any other element, the process is
called a random sampling procedure. The
resulting samples are “random samples.” The term
“random” implies that all possible samples of a
given size have the same probability.

In practice it is often difficult to obtain a ran- 
dom selection of elements from a population. It is 
not sufficient that the selection be haphazard. We 
must be certain that the method of selection and 
the values of the variable in the population are un- 
related. Where the population can be enumerated, 
however, it becomes possible to select a random 
sample by use of a table of random numbers. 
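
The selection from an enumerated population can be sketched in a few lines of Python. This is only an illustration, not part of the original discussion: the pseudo-random generator stands in for a printed table of random numbers, and the function name and the population of 1,000 numbered cards are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every subset of size n is equally likely.

    The pseudo-random generator plays the role of a table of random numbers
    (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    return rng.sample(list(population), n)

# Hypothetical enumerated population: cards numbered 1 through 1000.
cards = range(1, 1001)
sample = simple_random_sample(cards, 100, seed=1)
```

Because the draw depends only on the enumeration and not on the values carried by the elements, the method of selection and the variable under study remain unrelated, which is the requirement stated above.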

Since each element of the population has a
known (in this case, equal) chance of being
included in the sample, it is observed that simple
random samples are a special case of probability
samples. Unless we know something more about
our population we cannot do better than to select a
simple random sample. It has the unique
advantage that the precision of the estimates can be
determined objectively without making questionable
assumptions. One main objection to simple ran- 
dom sampling was the cost of carefully designing 
a satisfactory procedure which could be effective- 
ly carried out. This difficulty led to the improve- 
ments in pure random sampling procedures that 
reduced the costs of sampling. 

The introduction of the principle of randomiza- 
tion and of the analysis of variance as the tech- 
nique of analysis of sample observations has made 
possible the attainment of unbiased estimates of 
the quantities under survey and of determining the
errors to which the estimates are subject. The 
analysis of variance, through making it possible 
to pool estimates of error and to separate compo- 
nents of error that are not homogeneous, has 
brought a drastic reduction in the number of inde- 
pendent sampling units required to be taken from 
each quantity of sampled material. This has
made possible the development of sampling
designs frequently involving samples in two or
more stages.


Stratified Random Sampling— This method of 
sampling utilizes supplementary information to 
obtain greater precision in the sample estimates. 
The population is first divided into sub-populations
called strata and from each stratum is
drawn a pre-determined number of observations
by random sampling, the drawings being made
independently in the different strata.

In its simplest form the only necessary
requirement for stratification is that the strata
actually differ one from another in the mean of the
characteristic under measurement. If the division
of the population into strata does not give
strata that are homogeneous with respect to the
characteristic under measurement, no gain will
result from stratification. The basic idea in this
method is that it may be possible to break down a
heterogeneous population into relatively
homogeneous strata. A precise estimate can be obtained
for each stratum and the several estimates can 
be combined into a precise estimate for the total 
population. 

The main part of the theory of stratified ran- 
dom sampling is concerned with the properties of 
the estimates obtained from this method and with 
the optimum choice of the sample size for the
several strata. Proportionate stratified sampling
employs a uniform fraction in the sample from
each stratum. This procedure gives a
self-weighting sample.
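
A minimal sketch of proportionate stratified sampling follows. The function names and the two illustrative strata are hypothetical and are not taken from the paper. The sketch also shows what "self-weighting" means in practice: with a uniform sampling fraction, the plain mean of all sampled values coincides with the stratum-weighted estimate.

```python
import random
from statistics import mean

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction, by simple random sampling, independently in each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = max(1, round(fraction * len(units)))
        sample[name] = rng.sample(units, n_h)
    return sample

def stratified_mean(strata, sample):
    """Combine the stratum sample means, each weighted by its share of the population."""
    total = sum(len(units) for units in strata.values())
    return sum(len(strata[h]) / total * mean(sample[h]) for h in strata)

# Hypothetical population: two strata that differ sharply in their means.
strata = {
    "small": [10.0] * 200 + [12.0] * 200,
    "large": [50.0] * 50 + [54.0] * 50,
}
sample = proportionate_stratified_sample(strata, fraction=0.1, seed=3)
estimate = stratified_mean(strata, sample)

# Self-weighting: the unweighted mean of the pooled sample gives the same figure.
pooled = mean(v for values in sample.values() for v in values)
```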

It is not necessary, however, that the same
proportion be taken from each of the strata. A
fundamental theorem gives results leading to the
optimum allocation of the sampling elements in the
several strata. This theorem [...] relate optimum
allocation to a stated total cost. However, since
[...] allocation of the sample [...] proportionally to the
[...] costs per unit concerned, small differences in
[...] standard deviations, if the cost differences
are substantial, it is advisable to introduce
differential allocation of sampling elements to the
[...] the strata have been defined.
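
The theorem itself is not legible in this passage. Assuming it is the standard textbook result on optimum allocation, in which the sample size in a stratum is made proportional to N_h S_h / sqrt(c_h) (stratum size, stratum standard deviation, and cost per unit), the allocation can be sketched as follows. The function name and the figures are hypothetical.

```python
import math

def optimum_allocation(sizes, sds, costs, n_total):
    """Allocate n_total units across strata in proportion to N_h * S_h / sqrt(c_h).

    This is the usual textbook optimum-allocation rule, assumed here to be
    the theorem the text refers to. Results are left unrounded.
    """
    weights = [N * s / math.sqrt(c) for N, s, c in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n_total * w / total for w in weights]

# Hypothetical strata: the second is smaller but twice as variable.
equal_cost = optimum_allocation([400, 100], [1.0, 2.0], [1.0, 1.0], 60)
# The same strata when a unit in the second stratum costs four times as much.
unequal_cost = optimum_allocation([400, 100], [1.0, 2.0], [1.0, 4.0], 60)
```

With equal costs the more variable stratum receives a larger share than proportionate sampling would give it; a substantially higher unit cost pulls its share back down, which is the trade-off the passage describes.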
In determining the boundaries of strata, effective
use should be made of every kind of information
helpful in allocating elements of the population
into groups differing from one another in regard
to the character under measurement or to the
expense of collecting data. But, of course, within
each stratum, the sample has to be a probability
sample since judgment must not enter into the
selection of the individual sampling elements.

Stratification should be regarded as only one of
the means of sample design to be taken into
consideration with the aim of increasing the amount
of information per unit cost. Another design
which is sometimes more important in reducing
costs is that of cluster sampling.

Cluster Sampling—We have previously considered
several ways in which a population can be
divided into units and have stated that a change in
the type of unit bears a close relation both to
sampling costs and the precision obtained.

The term “elementary unit” denotes an individual
member of the particular population. This is
the element on which measurements are desired;
in the aggregate these elements constitute the
materials upon which the analyses are made, such
as the determination of averages and percentages.
The elementary unit is determined by the objectives
of the survey and depends upon the analysis to be made.
For example, we may wish to determine the median
teacher’s salary or the average family income.
In the former case the individual teacher is the
elementary unit, in the latter the family is the
elementary unit. The elementary unit is determined
by the purpose of the survey and not by the
sampling design.

At times the objective of the survey does not
necessitate the designation of an elementary unit
since only aggregates are to be measured. Again,
several elementary units may be utilized in the
same survey. Such is the case where both individual
and family traits are estimated in the same
sample survey.

In cluster sampling one of the leading practical
problems is to lay out and define the clusters. 
The population of elementary units under consider- 
ation is divided into groups or clusters, which are 
the primary sampling units. 

Cluster sampling may involve single or multiple
stage sampling. In single stage cluster sampling,
a simple random sample is taken of the
clusters into which the population has been divided.
Since the cluster is comprised of a group of
the units of observation, it is not necessary to
measure all of the units that make up the sampling
unit. We may, therefore, select and measure a
sample of the elements in any cluster. The term
“subsampling” is sometimes applied to this technique
since the primary sampling unit (the cluster)
is not measured entirely, but is itself sampled.
The term we apply here is two-stage sampling
since the sample is taken in two steps.
This type of sampling design involves the following
procedures:


1. The primary units are selected by simple
random sampling.

2. The second-stage units are chosen by sim- 
ple random sampling from the primary units in 
number 1 above. 

3. Uniform fractions are applied in number 2 
to all the primary sampling units selected. 
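
The three procedures above can be sketched as follows. The cluster names, the cluster sizes, and the function name are hypothetical and serve only to make the steps concrete.

```python
import random

def two_stage_sample(clusters, n_primary, fraction, seed=None):
    """Two-stage sampling with simple random sampling at both stages.

    Step 1: choose n_primary clusters (primary units) at random.
    Step 2: within each chosen cluster, choose second-stage units at random.
    Step 3: the same sampling fraction is applied in every chosen cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_primary)
    sample = {}
    for name in chosen:
        units = clusters[name]
        m = max(1, round(fraction * len(units)))
        sample[name] = rng.sample(units, m)
    return sample

# Hypothetical frame: 20 schools (primary units) of 30 pupils each.
clusters = {
    f"school_{i}": [f"pupil_{i}_{j}" for j in range(30)] for i in range(20)
}
sample = two_stage_sample(clusters, n_primary=5, fraction=0.2, seed=7)
```

Only the five selected schools need a list of pupils, which anticipates the point made below about compiling frames only for units already drawn.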


The processes can be extended to any number 
of stages. Each unit that falls into the sample at 
any particular stage is subdivided into units in 
preparation for the next stage. Thus, in three- 
stage sampling there will be primary, secondary, 
and tertiary units. Sometimes four stages are 
used and occasionally more. An example of a 
four-stage cluster sampling would be: first stage, 
sampling of the 87 counties in Minnesota; second 
stage, sampling of the cities or towns within the 
counties; third stage, sampling of schools within
the cities or towns; fourth stage, sampling of the classes
within the schools. 

In multi-stage sampling, there must be a 
frame (roster) or one must be constructed for 
every sampling unit that is to be sampled. Thus, 
to start with, there must be a frame that de- 
scribes all the primary units in the population. 
Then, for every primary unit that falls into the 
sample there must be a frame (or one must be 
compiled) that describes the secondary units. 
For each secondary unit that falls into the sample, 
there must be a frame that describes the tertiary 
units, and so forth. At each successive stage in 
the sequence, the sampling units become smaller 
and the frames become more and more detailed. 
In the last stage, the frame is comprised of the
ultimate units, which might be single households,
several successive households, individual persons,
or small areas.

Technically, one of the main economies of multiple
stage sampling is that the compilation of the
frame for the next stage is necessary only for the
units that have already been drawn into the sample.
The subunits within any larger unit have to
be exhaustive; in the aggregate they must account
for all of the larger units, so that every [...],
every school, every business, and so on is in one
and only one subunit at any given stage. If this
is not true, the probability of any [...],
school, farm, etc., being included in the sample
will not correspond with the [...] on which
the mathematical theory is established, and
the estimates of the errors of random sampling
will be invalidated.

A cluster sample almost always has a larger
sampling error than a simple random sample of
the same size. This is due to the fact that in
sampling groups or aggregates of individuals in
the population there is usually a positive
intraclass correlation of the variable within the
groups under investigation.
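
The effect of intraclass correlation on precision is often summarized by a standard approximation that is not given in the text: for clusters of equal size m and intraclass correlation rho, the variance of the cluster-sample mean is roughly 1 + (m - 1) rho times that of a simple random sample of the same size. The sketch below assumes that approximation; the figures are hypothetical.

```python
def design_effect(cluster_size, intraclass_corr):
    """Approximate variance inflation for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * intraclass_corr

def effective_sample_size(n, cluster_size, intraclass_corr):
    """Size of the simple random sample that would give about the same precision."""
    return n / design_effect(cluster_size, intraclass_corr)

# Hypothetical case: 600 pupils sampled as 20 intact classes of 30,
# with an assumed intraclass correlation of 0.1.
inflation = design_effect(30, 0.1)
equivalent_n = effective_sample_size(600, 30, 0.1)
```

Under these assumed figures even a modest intraclass correlation inflates the variance nearly fourfold, so 600 pupils in intact classes carry about as much information as some 150 drawn individually.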

The sampling unit practically obtainable in ed- 
ucational and psychological research is often the 
class, grade, or some other grouping of individual
pupils. Cluster sampling is an extremely valuable
method, therefore, in educational and psychological
research. It is necessary in using it to
know the conditions and means by which statistical 
estimates and the measures of their sampling er- 
ror may be accurately ascertained. 

The principal advantage of cluster sampling and 
of various multi-stage sampling designs in the 
field survey is in the reduction in travel time. 
However, here as elsewhere the decision of wheth- 
er to use this sampling plan will depend on the rel- 
ative costs and precision to be obtained. 

“Area sampling” is a method of sampling
which makes use of such means as a clearly defined
map or an aerial photograph of sampling
units of small or large areas, as the case may be,
in a particular region, when a definite numbering
or a list of the sampling units is not available.
The sampling unit may be the farm in surveys of
acreage of crops or the individual dwelling unit
in certain social surveys. Here, neither the identity
of individual farms or dwelling units nor their
numbers in the areas need be known in advance.
But after having obtained the map or the areas
photographed, we could adopt a numbering procedure
which could make possible the drawing of a random
or probability sample. Further, the map or
the aerial photograph could also be used for the
choice of appropriate sampling units.

Systematic sampling, to be discussed in the next
section, is a particular case of cluster sampling
where the sample is a single cluster.

Systematic Sampling—A convenient form of
sampling is that which consists of taking the
sampling units from a list of all the sampling units of
the population. The term “frame” has been used
by the U. N. subcommission on sampling to specify
this form of listing of the population. Since
about 1944 there has been considerable theory
developed concerning this form of sampling, which
has come to be known as systematic sampling.
Examples of a frame would be a listing of all the
rural schools in a state or in the nation, or, in a
large manufacturing plant, the complete list of
employees available in card files containing
characteristics for each employee on an individual
card.

Thus it might be of interest to know the proportion
of employees who had been graduated from
high school. To carry out a sampling study one
could take a sample, say, of 100 cards from a file
comprised of 1000 cards. The first card chosen
would be determined by giving each of the first 10
cards a number, placing these numbers on uniform
pieces of cardboard, putting the pieces in a hat,
and mixing them thoroughly. One piece of cardboard
would then be selected at random. This operation
is spoken of as making the first entry at random.
Then one would proceed by taking the card in the
position specified by the number and every tenth card
thereafter until the sample of 100 had been taken.
For illustration, if the number of the cardboard
drawn from the hat were 8, the respective cards
chosen from the file would be 8, 18, 28, etc.
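
The card-file procedure can be sketched as below. The function name is hypothetical, and the pseudo-random draw merely stands in for the numbered pieces of cardboard in the hat.

```python
import random

def systematic_sample(frame, k, start=None, seed=None):
    """Take the unit in position `start` (1-based) and every k-th unit thereafter.

    If no start is supplied, one is drawn at random from 1..k, which is the
    "first entry at random" described in the text.
    """
    if start is None:
        start = random.Random(seed).randint(1, k)
    return frame[start - 1::k]

# The file of 1000 cards, sampled at an interval of 10 with the start drawn as 8.
cards = list(range(1, 1001))
picked = systematic_sample(cards, k=10, start=8)
# picked begins 8, 18, 28 and contains 100 cards in all.
```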

More generally, let us consider a finite population
made up of the elements $x_1, x_2, \ldots, x_{nk}$,
where $n$ and $k$ are integers. A systematic sample
is obtained by choosing an element at random
from the elements $x_1, \ldots, x_k$, and then selecting
every $k$th consecutive element, i.e., if $x_i$ is the
element first chosen, the systematic sample comprises
the elements $x_i, x_{i+k}, \ldots, x_{i+(n-1)k}$.

Systematic sampling has found substantial use
in practice since it is frequently easier to
select and administer than is a random or stratified
random sample. This is particularly true
if the drawing of the sample occurs in the field.
Then, too, this method possesses a certain intuitive
appeal in that the sample is spread evenly
over the population.


Possibly a more pertinent parallelism is that
between systematic sampling and the stratified
random sample having one element per stratum.
In the latter method, the population is subdivided
into $n$ strata: $(x_1, \ldots, x_k)$, $(x_{k+1}, \ldots, x_{2k})$, ... and
one sampling [...] independently at random
from each of the strata. The difference is
[...] the units all come [...] the stratum while
[...], the position of [...] designated separately
in each stratum. [...] easier to obtain than
[...] and it is also likely to [...]
units chosen usually give, [...]


The largest reduction in variance takes place
when a high correlation exists between adjacent
units on the frame with respect to the traits under
measurement and when the serial correlation
decreases with increase of the interval between
units. Serial correlations are frequently found in
situations where observations vary with time.
Examples are amount of fatigue of children at different
hours, prices of stocks on different days, and
temperatures at different times of day. Another
such phenomenon is variation in plant growth on
areas of soil with differing fertilities.

A principal defect in systematic sampling is
that there is no formula for the sampling results
which is generally valid for the sampling error of
the estimate. Various approximations are available
although these commonly give overestimates.
The random start systematic sample estimate of
the population mean is unbiased.

A consistent estimate of the variance cannot be 
obtained from a systematic sample selected with
a single random start. Some approximate estimates
are, however, very useful for survey
results, when periodicities do not occur and serial
correlations are not high between nearby units in 
the order of picking. 

Perhaps the biggest risk in using systematic
sampling is with data that are periodic with
respect to the order of the listing in the frame, that
is, if the interval between units equals the period 
or some multiple of it. This danger in systemat- 
ic sampling from a population with periodicities 
is particularly great since the sample itself may 
afford no evidence of the periodicity. 
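
The hazard can be made concrete with a small, wholly hypothetical frame whose values repeat with a period equal to the sampling interval. Every systematic sample is then internally constant, so it shows no variation at all and gives no hint that its mean may be far from the population mean.

```python
from statistics import mean

# Hypothetical frame: values cycle 0, 1, ..., 9 along the list (period 10).
frame = [position % 10 for position in range(1000)]
k = 10  # the sampling interval equals the period

# One systematic sample for each possible random start.
samples = [frame[start::k] for start in range(k)]
sample_means = [mean(s) for s in samples]
within_sample_ranges = [max(s) - min(s) for s in samples]

population_mean = mean(frame)
```

Here the ten possible sample means run from 0 to 9 while the population mean is 4.5. Averaged over all random starts the estimate is still unbiased, but any single sample can be badly off, and its zero internal spread would suggest a precision it does not have.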

Since both substantial advantages and considerable
losses may sometimes occur in using systematic
sampling, the research worker needs to know
situations when this method may result in materially
larger or substantially smaller sampling variances
than would alternative methods of random
selection.


More Complex Sample Designs 


Up to now we have described the more commonly
known and used sampling designs: simple
random sampling, stratified random sampling,
cluster sampling, and systematic sampling as
discrete sample designs. Often these designs are
sufficient for the problem in hand. There is a
class of problems, however, for which a considerably
better solution can be gained by the use of
combinations of these (and other) methods and by
different estimating processes. One stage of the
same sampling problem may involve one design
followed by another and so on for several stages.
Different kinds of probability samples are drawn
at each stage. Different types of estimates may
be used.⁵


In addition to the construction of designs, there
are certain tools other than those mentioned
previously, which can result in a substantially better
job of sampling. We can only mention examples
of these briefly.

Sometimes improved results are attained by
the choice of more efficient estimators. For example,
a method of double sampling might be
used. This method involves two sampling investigations.
The first consists in drawing a large
unrestricted sample from the population and
determining, for each sampling unit, the value of a
character on which the collection of information
is easy and relatively inexpensive. This secondary
character is known to be highly correlated
with the primary character with which the investigation
is concerned. The collection of data on
the primary character is costly.

The second investigation consists in drawing 
a small sample in which the values of both the 
primary and secondary characteristics are ascer- 
tained. It now becomes possible to find the re- 
gression of the primary on the secondary charac- 
ter. The predicted value in the regression equa- 
tion corresponding to the difference in the mean 
values of the secondary character in the two samples
is then used as an estimate of the mean value
of the primary character in the total population.
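
A minimal sketch of the estimate just described, assuming an ordinary least-squares regression of the primary on the secondary character fitted in the small sample. The function name and the data are hypothetical; the paper gives no numerical example at this point.

```python
from statistics import mean

def double_sampling_estimate(large_x, small_x, small_y):
    """Regression estimate of the mean of the primary character y.

    large_x: secondary character in the large, inexpensive sample
    small_x, small_y: both characters in the small, costly sample
    The small-sample mean of y is adjusted by the fitted slope times the
    difference between the two sample means of x.
    """
    x_bar, y_bar = mean(small_x), mean(small_y)
    sxx = sum((x - x_bar) ** 2 for x in small_x)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(small_x, small_y))
    slope = sxy / sxx
    return y_bar + slope * (mean(large_x) - x_bar)

# Hypothetical data in which y = 2x + 1 exactly in the small sample.
small_x = [1, 2, 3, 4]
small_y = [3, 5, 7, 9]
large_x = [1, 2, 3, 4, 5, 6, 7, 8]
estimate = double_sampling_estimate(large_x, small_x, small_y)
```

The small sample alone would give a mean of 6; because the large sample shows the secondary character to average 4.5 rather than 2.5, the regression carries the estimate up to 10.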
Another example of a tool leading to an improved
estimate is to permit sampling units to
be drawn with arbitrary probabilities. Where it
becomes possible to find a basis for assigning
the arbitrary probabilities it is possible to make
very substantial improvement in the efficiency of
a sample. To obtain unbiased estimates with unequal
sampling probabilities, the sampling elements
are weighted by the reciprocal of the probability.
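
Weighting by the reciprocal of the probability can be checked on a toy case. In the sketch below, which is an illustration and not the paper's own example, each of three hypothetical units enters the sample independently with its own known probability (an assumption made so that every possible sample can be enumerated), and the weighted estimate of the population total is averaged over all possible samples.

```python
from itertools import product

def weighted_total(values, probs):
    """Estimate a population total: each sampled value is divided by its selection probability."""
    return sum(y / p for y, p in zip(values, probs))

# Hypothetical population of three units with unequal selection probabilities.
values = [10.0, 20.0, 70.0]
probs = [0.2, 0.3, 0.9]

# Average the estimate over every possible sample, weighting by its chance.
expected = 0.0
for flags in product([0, 1], repeat=len(values)):
    chance = 1.0
    for flag, p in zip(flags, probs):
        chance *= p if flag else 1.0 - p
    ys = [y for y, flag in zip(values, flags) if flag]
    ps = [p for p, flag in zip(probs, flags) if flag]
    expected += chance * weighted_total(ys, ps)
```

The long-run average of the weighted estimate equals the true total of 100, even though the large unit is far more likely to be drawn than the small ones.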
We shall conclude this discussion of sampling
designs by describing briefly two sampling designs
still in rather common use. These designs
have the semblance but not the substance of
probability sampling designs.


Quasi-Representative Sampling Plans 


Purposive Sampling—In purposive sampling
some preliminary segregation of certain parts of
the population is made and the sampling is
restricted to these parts. These segregated parts
are sometimes those thought by the investigator,
or by some solicited authority or expert, to be
typical of the population; for instance, there may
be judgment selection with respect to certain
characteristics of the population of typical or
“representative” counties, cities, [...] individual
households, blocks and so forth. At other
times selection is made because of convenience,
that is, any group that might be handy, such as a
class of students.

A more objective method of purposive selection
is restricted to sampling aggregates which
have the same average as the population with
respect to one or more controls. It is [...]
that the entire aggregate should make up the
sample. The belief is that since the controls have
averages in the sample the same as those in the
population, the means of the investigated variables,
assumed to be positively correlated with the
controls, will accordingly be better estimated. The
distinguishing feature of this sample plan, then, is
the restriction of the sampling to the part of the 
population picked on the basis of the control aver- 
ages. The variability of the known quantitative 
characteristic(s), as well as of the other charac- 
ters closely correlated with it, will clearly be con- 
siderably less than the real variability in the popu- 
lation. 

A variant of the above plan is observed in the 
attempt to get a “perfect cross-section" from a 
sample with the last census on certain characteris- 
tics. Thus, it is possible to make up a “sample” 
of persons by adding and subtracting individuals so 
that finally the sample corresponds almost exactly 
with the last census on, say, age-groups, sex, edu- 
cation, economic status, and others. This sam- 
pling plan is very hazardous since it may fail al- 
most completely to agree with the population that it 
was designated to represent with respect to the 
characteristics the survey was contemplated to 
measure. 

Advocates of purposive sampling claim as
advantages for it that it is sometimes possible to use
this method where randomization is not possible
and that the enumeration covering selected areas
or districts would be less expensive. The main
disadvantages of the method are that (1) substantial
information of the population must be had in advance
of the sample, (2) the controls used are frequently
defective, and (3) the method is not amenable to the
development of a sampling theory since it includes 
no element of random sampling. One cannot obtain 
from the sample itself an objective measure of the 
precision of the sample estimates. 

Quota Sampling— This method is a variant of 
purposive selection. As it is used in practice by 
a number of agencies, interviewers are given as- 
signed quotas of people of different age-groups, 
socio-economic status, etc., and are instructed 
to secure the specified number of interviews in 
each quota. Added directions proposed to avoid 
excessively unrepresentative selections within the 
allotted quotas are sometimes given. Thus an
interviewer may be asked to secure twelve interviews
with housewives, who are not employed full time,
who own their own houses, etc. The enumerator
is instructed to continue sampling until the
required “quota” has been secured in each stratum.
The interviewer does not select the interviewees 
at random. He may take advantage of any knowledge
that enables him to fill his quota quickly.
Varying amounts of latitude are allowed the
interviewer. The interviews are not often carried out by
house-to-house canvass but may at times be done
by interviewing in streets or other public places,
or even now and then by telephone.

The objective is to get the advantages of
stratification without the higher field costs resulting
from selecting units at random. It is evident
that, however accurately the quotas are met,
such samples do not constitute random samples.
Consequently, sampling theory cannot be applied
to quota sampling. The accuracy of the results
must be based on assumptions and judgments that
cannot be estimated objectively. Accordingly,
information cannot be obtained from the sample
results to assay their precision. These methods
often deal with collecting opinions and their
reputed success is likely due to the fact that the
validity of opinions is rarely, if ever, tested.

Accordingly, the danger of bias always exists 
and the quota method has to be ruled out as an 
appropriate method of investigation for precise 
inquiries where unbiased results are indispensable.

It should not be inferred that since the pur- 
posive and quota sampling methods are types of 
judgment samples, prior knowledge and judgment 
do not enter in the design of probability samples. 
Knowledge and judgment are made use of in a
number of ways, for example, in defining the
kind and size of units of sampling, in laying out
homogeneous and heterogeneous areas, and in
reduction of sampling error by classifying sampling
elements into strata in an appropriate way.
The point is, however, that this information and
these processes are not permitted to influence the
final selection of the particular sampling items that
are to comprise the sample. The final selections
must be automatic, that is, by random processes,
beyond the control of the investigator. It
is only by this safeguard that the bias of selection
is eliminated and the magnitude of the sampling
error made measurable and controllable.


FOOTNOTES 


1. Presidential address on “The U. N. Subcommission
on Statistical Sampling” at the session
on sampling, International Statistical Institute,
Berne, September 1949.

2. This process of sampling will never exhaust
the population. Where a continuous mathematical
function, e.g., the normal curve, is used
to represent the observations, the effect is to
replace a finite by an infinite population.

3. A Million Random Digits with 100,000 Normal
Deviates (Glencoe, Ill.: Rand Corporation, the
Free Press, 1955), 600 pp.

4. For an illustration of the application of this
theorem, see: Palmer O. Johnson, Statistical
Methods in Research (New York: Prentice-Hall,
1949), pp. 202-206.

5. For an example of more complex sample
designs, see Palmer O. Johnson and M. S. Rao,
Modern Sampling Designs: Theory, Practice,
and Experimentation (Minneapolis: University
of Minnesota Press, 1959), 100 pp.


TABLES FOR TRANSMUTATION OF ORDERS
OF MERIT INTO NORMAL EQUIVALENTS


KENNETH E. ANDERSON 
University of Kansas 
E. L. BARNHART* 
Kansas State Teachers College 
Emporia, Kansas 


TABLES¹ PUBLISHED in 1954 were adapted
from a table presented by C. L. Hull² in 1922 for
the purpose of changing orders of merit, or ranks,
into normalized scores. The tables were developed
to obviate computing the percent positions of
the individuals of a group when it was desired to
find their normalized scores. In its original
form, Hull's table contained corresponding values
of “percent position” and normalized scores. The
“percent position” was defined as

$$\frac{100(R - .5)}{N}$$

where R is the rank of the individual in the series
and N is the number of individuals ranked. By
means of this table, then, it was possible to provide
a set of normalized scores on a given characteristic
for a group of individuals by first ranking
them on the characteristic, then transforming the
ranks into percent positions by the formula, and
finally obtaining from the table the corresponding
normally distributed scores.

The tables published in 1954 as adapted from
Hull were based on a range of ranked ability
arbitrarily cut off at plus and minus 2.5 standard
deviations. The baseline of his curve was 5 standard
deviations and each of the 100 parts was equal
to .05 standard deviations. In order to [...]
this shortened range of ability, the present
tables were developed in terms of normal equivalents
(T scores). They contain the normal equivalents
corresponding to every rank in groups of all
sizes from 1 to 100 individuals. To
find the normalized score for a given individual it
is necessary only to find the table column
corresponding to the number of individuals in the group
and the table row corresponding to the rank of the
individual in the group. The score will lie at their
intersection. For example, suppose an individual
ranks 8th in a group of 35 persons with respect to
a given characteristic. Locating the table column
corresponding to “size of class” equal to 35 and
the table row corresponding to “rank in class”
equal to 8, we find a value of 58 at their intersection.
This value is the score, out of a possible
100, which would theoretically be made by the 8th
ranked individual in a group of 35, if the scores
were normally distributed.

The top portion of Table I gives the normal
equivalents of ranks in groups of all sizes from 1
to 25, where

$$T = 50 + \frac{10(X - M)}{S.D.}$$

Thus, a rank of 1 in 25 has a percent position of:

$$P = \frac{100(R - .5)}{N} = \frac{100(1 - .5)}{25} = 2.00$$

Referring to the unit normal curve, we obtain a
$x/\sigma$ of 2.05. Thus the normalized equivalent of a
rank of 1 in 25 is:

$$T = 50 + 10(2.05) = 70.5 \text{ or } 71.$$


* Formerly Assistant in Statistical Laboratory,
School of Education [...].

1. Kenneth E. Anderson and others, “Tables for
Transmutation of Orders of Merit into Units of Amount
or Scores,” Journal of Experimental Education,
XXII (March 1954), pp. 247-55.

2. C. L. Hull, “The Computation of Pearson's
r from Ranked Data,” Journal of Applied Psychology,
VI (1922), pp. 385-90.


[Tables I through VIII, “Orders of Merit into Normal Equivalents” (columns: size of class; rows: rank in class), are not reproduced here.]


The process used to obtain the normal equivalents
is illustrated below for a group size of 31.


The percent position for a rank of 1 in 31 was
obtained as follows:

$$P = \frac{100(1 - .5)}{31} = 1.612903$$

$$100.00 - 1.612903 = 98.387097$$

The percent position for a rank of 2 in 31 could
be obtained by the same process. However, the
following constant was used to obtain the percent
position:

$$100(1/31) = 3.225806$$

$$98.387097 - 3.225806 = 95.161291$$

The same constant was subtracted from each
succeeding percent position to obtain the next percent
position. Having obtained the percent positions,
the $x/\sigma$ values were obtained from a unit normal
table, and then converted to T scores by using the
following formula:

$$T = 50 + 10(x/\sigma)$$

Rank    % Position    x/σ      T Score
1       98.387097     2.14     71.4
2       95.161291     1.66     66.6
3       91.935485     1.40     64.0
4       88.709679     [...]    [...]
5       85.483873     [...]    [...]
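
The same conversion can be sketched in Python, with the inverse normal function of the standard library standing in for the printed unit normal table. The function names are hypothetical. Following the worked example, the percent position is counted from the bottom of the distribution, so that the top rank receives the highest score.

```python
from statistics import NormalDist

def percent_position(rank, n):
    """Percent position as in the worked example: 100 - 100(R - .5)/N."""
    return 100.0 - 100.0 * (rank - 0.5) / n

def t_score(rank, n):
    """Normal equivalent of a rank: T = 50 + 10(x/sigma)."""
    z = NormalDist().inv_cdf(percent_position(rank, n) / 100.0)
    return 50.0 + 10.0 * z

# Ranks 1 to 3 in a group of 31, and the rank of 8 in a group of 35.
first_three = [round(t_score(r, 31), 1) for r in (1, 2, 3)]
eighth_of_35 = round(t_score(8, 35))
```

Rounded to one decimal, the first three ranks in a group of 31 reproduce the 71.4, 66.6 and 64.0 of the illustration, and the rank of 8 in a group of 35 gives the 58 read from the table in the earlier example.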
ANALYSIS OF COMPLEX CONTINGENCY DATA


CYRIL J. HOYT, P. R. KRISHNAIAH, E. PAUL TORRANCE**
University of Minnesota

* All footnotes will be found at end of article.


THE METHODS of multiple and partial correlation
and regression have been used in the investigation
of numerous problems in education and
psychology. These methods, when used properly,
have provided research workers with an adequate
means of investigating problems of the
interdependence among variables that are measured
on interval or ratio scales. In many cases,
however, the variables considered are measured
in terms of nominal scales by classifying data
into categories. Under these circumstances,
contingency tables are often developed. In many
instances the contingency data are classified on
more than two bases, thus giving rise to complex
contingency tables which have dimensionality
of three or more.

Part II of this paper shows the derivation of
the maximum likelihood estimates of probabilities
that are used for testing certain hypotheses
analogous to those involved in problems of partial
and multiple regression. Part I of this paper
gives a four dimensional illustrative contingency
table and shows how to apply the procedure for
testing a number of hypotheses.


PART I


The data given in Table I consist of the [...]
in each sub-category of a four [...]
of the April 1939 status of 13,[...]
Minnesota high school¹ graduates of June 1938.
The four variates considered are (a) position [...]
in high-school graduating class, (b) post
high-school status in April 1939, (c) sex, and
(d) father’s occupational level in seven [...].
The statistical procedure discussed in Part
II of this paper is used to test certain hypotheses
concerning the dependence of post high-school
status on the other three variables. It does
not contain a complete analysis but does include
sufficient tests to indicate the [...]
of the procedure.

The hypothesis analogous to the [...]
of the multiple regression of post high-school
status on the other three variables is
designated $H_I$. Thus $H_I$ may be stated:
post high-school status is independent of high-school
graduation class-rank, sex and paternal
occupational level. If $i_2$ designates post high-school
status, and $i_1$, $i_3$ and $i_4$ indicate high-school
graduation rank, sex and paternal occupational level
respectively, the probability of an observation falling
in the $(i_1, i_2, i_3, i_4)$th cell may be designated as
$p_{i_1 i_2 i_3 i_4}$. Marginal and submarginal probabilities
are designated by replacing one or more of
the $i$ subscripts by zeros. Thus, for example,
$p_{0 i_2 0 0}$ indicates the four probabilities that an
observation falls in a specified ($i_2 = 1, 2, 3$ or $4$) cat-
egory of post high school status. That is, the sub- 
script “0” indicates that all categories on that 
particular variable are summed. In terms of 
these symbols Hj may be designated as P jj igigig 
= Роїзоо Pi, oisi, for all 11191314 

Part II shows that the maximum likelihood estimates of these p's are obtained by taking the ratios of the corresponding marginal totals or subtotals to the total number of observations in the table. Thus the chi-square test statistic appropriate for testing $H_1$ is:

$$\chi_1^2 = \sum_{i_1} \sum_{i_2} \sum_{i_3} \sum_{i_4} \frac{\left(n_{i_1 i_2 i_3 i_4} - n\,\hat{p}_{i_1 i_2 i_3 i_4}\right)^2}{n\,\hat{p}_{i_1 i_2 i_3 i_4}}$$

where $\hat{p}_{i_1 i_2 i_3 i_4} = \hat{p}_{0 i_2 0 0}\, \hat{p}_{i_1 0 i_3 i_4}$,

$$\hat{p}_{0 i_2 0 0} = \frac{n_{0 i_2 0 0}}{n} \quad \text{and} \quad \hat{p}_{i_1 0 i_3 i_4} = \frac{n_{i_1 0 i_3 i_4}}{n} \quad \text{by (1) of Part II}$$

for all values of $i_1$, $i_2$, $i_3$ and $i_4$.

In this example (computation notes are given in Part III) the value of $\chi_1^2$ = 2838. This value is interpreted as a $\chi^2$ with 123 degrees of freedom. Part II indicates that the degrees of freedom are determined by the formula $(m_2 - 1)(m_1 m_3 m_4 - 1)$, which in this case is 3[(3)(2)(7) - 1] or 123. Thus $H_1$ is rejected and the conclusion drawn that the multiple dependence of post high-school status on the other three variables is significant.
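As a rough illustration of the test of $H_1$, the sketch below computes the statistic for any multiway table stored as a dictionary of cell counts. It uses the computational form given in Part III, $n \sum n_{cell}^2 / (n_a n_b) - n$, where $n_a$ and $n_b$ are the two marginal totals of each cell. The function name and data layout are assumptions made for this example, not part of the original procedure, and the degrees of freedom are counted from the categories actually present in the data.

```python
from collections import defaultdict


def chi_square_one_vs_rest(counts, axis):
    """Chi-square for independence of one variate from all others jointly.

    counts: dict mapping cell tuples (i1, i2, ...) to observed frequencies.
    axis:   position in the tuple of the variate tested (hypothetical name).
    Returns (chi_square, degrees_of_freedom).
    """
    n = sum(counts.values())
    single = defaultdict(int)   # marginal totals of the tested variate
    rest = defaultdict(int)     # sub-marginal totals of the remaining variates
    for cell, freq in counts.items():
        single[cell[axis]] += freq
        rest[cell[:axis] + cell[axis + 1:]] += freq

    total = 0.0
    for cell, freq in counts.items():
        denom = single[cell[axis]] * rest[cell[:axis] + cell[axis + 1:]]
        total += freq * freq / denom
    chi_square = n * total - n
    # Assumes every category and every combination of the other variates occurs.
    df = (len(single) - 1) * (len(rest) - 1)
    return chi_square, df


if __name__ == "__main__":
    # Small made-up 2 x 2 table, used only to exercise the function.
    table = {(0, 0): 10, (0, 1): 20, (1, 0): 30, (1, 1): 40}
    print(chi_square_one_vs_rest(table, axis=0))
```

For a four-way table keyed by (rank, status, sex, occupational level), calling the function with `axis=1` would correspond to the hypothesis $H_1$ of the text.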

Further investigation of this dependence may suggest the testing of such hypotheses as $H_{1a}$ and/or $H_{1b}$. $H_{1a}$ concerns the dependence of post high-school status on high-school graduation rank for girls in the highest paternal occupational level, while $H_{1b}$ concerns the dependence of post high-school status on paternal occupational level for boys in the highest third of their graduating class. These hypotheses are tested with ordinary two-way chi-square tests. Table II gives the frequency table for testing $H_{1a}$ and Table III gives that for testing $H_{1b}$. Both of these hypotheses are also rejected since $\chi_{1a}^2$ = 77 and $\chi_{1b}^2$ = 245, where d.f. are 6 and 18 respectively.

Other hypotheses of this type or of a more complex type can be tested for the further investigation of the multiple regression of high-school status on the other three variables. See also $H_4$ and $H_5$ below.

$H_2$ is another hypothesis which, in some instances, would be of preliminary interest in studying the mutual interdependence of the four variables. That is, $H_2$ states that the four variables, high-school rank, post high-school status, sex and paternal occupational level, are mutually independent. In terms of the notation above

$$H_2:\ p_{i_1 i_2 i_3 i_4} = p_{i_1 0 0 0}\, p_{0 i_2 0 0}\, p_{0 0 i_3 0}\, p_{0 0 0 i_4} \quad \text{for all } i_1, i_2, i_3, i_4.$$

The maximum likelihood estimates of $p_{i_1 i_2 i_3 i_4}$ are obtained by taking the products of the estimates

$$\hat{p}_{i_1 0 0 0} = \frac{1}{n}\,(n_{i_1 0 0 0}) \quad \text{for all values of } i_1$$

$$\hat{p}_{0 i_2 0 0} = \frac{1}{n}\,(n_{0 i_2 0 0}) \quad \text{for all values of } i_2$$

$$\hat{p}_{0 0 i_3 0} = \frac{1}{n}\,(n_{0 0 i_3 0}) \quad \text{for all values of } i_3$$

$$\hat{p}_{0 0 0 i_4} = \frac{1}{n}\,(n_{0 0 0 i_4}) \quad \text{for all values of } i_4$$

These are obtained by using (3) of Part II. The appropriate test statistic for $H_2$ is again of the form $\chi_1^2$ where the $\hat{p}$ are defined above,

$$\chi_2^2 = \sum_{i_1} \sum_{i_2} \sum_{i_3} \sum_{i_4} \frac{n_{i_1 i_2 i_3 i_4}^2}{n\,\hat{p}_{i_1 i_2 i_3 i_4}} - n$$

The value of $\chi_2^2$ for the data in Table I is ..., which is found to be significant for ... degrees of freedom. Thus the hypothesis of mutual independence of the four variables is refuted. The number of degrees of freedom is determined as

$$(m_1)(m_2)(m_3)(m_4) - (m_1 + m_2 + m_3 + m_4) + 4 - 1$$

$H_3$ illustrates a third type of hypothesis which may be of interest in some cases. Symbolic notation of $H_3$ is

$$H_3:\ p_{i_1 i_2 i_3 i_4} = p_{i_1 i_2 0 0}\, p_{0 0 i_3 i_4} \quad \text{for all } i_1, i_2, i_3, i_4.$$

This hypothesis is stated: High-school graduation rank and post high-school status are independent of sex and paternal occupational level. Part II shows the maximum likelihood estimates required for testing this hypothesis and gives a formula for $\chi_3^2$ exactly like that for $\chi_1^2$ given above, using the following values for $\hat{p}_{i_1 i_2 i_3 i_4}$:

$$\hat{p}_{i_1 i_2 i_3 i_4} = \frac{1}{n^2}\,(n_{i_1 i_2 0 0})(n_{0 0 i_3 i_4}) \quad \text{by using (4) of Part II.}$$

For the data in Table I, the $\chi_3^2$ for testing $H_3$ has a value of 2420, which is significant for 143 degrees of freedom. Thus, a student's high-school rank and post high-school status are related to his sex and paternal occupational level. The general form for the number of degrees of freedom for $H_3$ is

$$m_1 m_2 m_3 m_4 - m_1 m_2 - m_3 m_4 + 1$$


$H_4$ and $H_5$ are hypotheses of the type analogous to the significance of partial regression coefficients. These are concerned with a consideration of the dependence of other variables while holding paternal occupational level constant. Thus $H_4$ may be stated: when paternal occupational level is held constant, high-school rank, post high-school status and sex are mutually independent. $H_5$ may be stated: when paternal occupational level is held constant, post high-school status is independent of high-school rank and sex.

... a hypothesis more ... highest level of ...

$$H_4:\ p_{i_1 i_2 i_3 i_4} = \frac{p_{i_1 0 0 i_4}\, p_{0 i_2 0 i_4}\, p_{0 0 i_3 i_4}}{p_{0 0 0 i_4}^2} \quad \text{for all } i_1, i_2, i_3, i_4$$

The maximum likelihood estimates of $p_{i_1 i_2 i_3 i_4}$ are

$$\hat{p}_{i_1 i_2 i_3 i_4} = \frac{n_{i_1 0 0 i_4}\, n_{0 i_2 0 i_4}\, n_{0 0 i_3 i_4}}{n\; n_{0 0 0 i_4}^2} \quad \text{for all } i_1, i_2, i_3, i_4.$$

With the above estimates of $p_{i_1 i_2 i_3 i_4}$ substituted in the formula given for $\chi_1^2$, the test statistic appropriate for testing $H_4$ is obtained.

$$H_5:\ p_{i_1 i_2 i_3 i_4} = \frac{p_{i_1 0 i_3 i_4}\, p_{0 i_2 0 i_4}}{p_{0 0 0 i_4}} \quad \text{for all } i_1, i_2, i_3, i_4$$

The test statistic $\chi_5^2$ for testing $H_5$ is obtained from $\chi_1^2$ by replacing $\hat{p}_{i_1 i_2 i_3 i_4}$ with

$$\hat{p}_{i_1 i_2 i_3 i_4} = \frac{n_{i_1 0 i_3 i_4}\, n_{0 i_2 0 i_4}}{n\; n_{0 0 0 i_4}} \quad \text{for all } i_1, i_2, i_3, i_4.$$

For the data in Table I, $\chi_4^2$ = 107,813 and has 119 degrees of freedom, while $\chi_5^2$ = 1374 and has 105 degrees of freedom. Thus, the test of $H_4$ indicates that the three other variables are not mutually independent for fixed paternal occupational level, while the test of $H_5$ indicates that post high-school status depends on high-school rank and sex for fixed paternal occupational level.

TABLE II
FREQUENCY FOR EACH HIGH-SCHOOL RANK X POST HIGH-SCHOOL STATUS, FOR GIRLS IN HIGHEST PATERNAL OCCUPATIONAL LEVEL

                                    High-School Rank
Post High-School Status     Lowest Third    Middle Third    Upper Third
In College                       53             163             309
In Collegiate School              7              30              17
Employed Full-time               13              28              38
Other

TABLE III
FREQUENCY FOR EACH POST HIGH-SCHOOL STATUS X PATERNAL OCCUPATIONAL LEVEL FOR BOYS IN UPPER THIRD OF GRADUATING CLASS

                                    Post High-School Status
Paternal Occupational Level       1       2       3       4
1                               256       2      10      53
2                               176       8      22      95
3                               119      10      33     257
4                               144      12      20     115
5                                42       2       7      56
6                                24       2

The particular hypotheses which were tested in this illustrative example were selected with the aim of showing the variety of tests that can be made with the $\chi^2$ statistic discussed in Part II. In any particular research problem the hypotheses tested will depend upon the questions under investigation. For example, a thorough study of the partial dependence of post high-school status on the other variables would proceed systematically by testing other hypotheses than $H_4$ and $H_5$ by holding constant other variates singly, in pairs and in triplets. In this case, however, there is abundant evidence of the dependence of post high-school status on all of the other variables.
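The degrees of freedom quoted for the five hypotheses can be checked with a short sketch. The function names are hypothetical. The forms used for $H_1$, $H_2$ and $H_3$ follow the formulas printed in Part I; the forms used for $H_4$ and $H_5$ are the usual conditional-independence counts and are assumed here because they reproduce the reported values of 119 and 105.

```python
from math import prod


def df_h1(m, tested):
    """One variate independent of the joint classification of the others."""
    others = [mi for i, mi in enumerate(m) if i != tested]
    return (m[tested] - 1) * (prod(others) - 1)


def df_h2(m):
    """Complete (mutual) independence of all variates."""
    return prod(m) - sum(m) + len(m) - 1


def df_h3(m, first_set):
    """Independence between two sets of variates."""
    a = prod(m[i] for i in first_set)
    b = prod(mi for i, mi in enumerate(m) if i not in first_set)
    return (a - 1) * (b - 1)


def df_h4(m, given):
    """Mutual independence of the other variates within each level of `given`."""
    rest = [mi for i, mi in enumerate(m) if i != given]
    return m[given] * (prod(rest) - sum(rest) + len(rest) - 1)


def df_h5(m, first_set, second_set, given):
    """Independence of two sets of variates within each level of `given`."""
    a = prod(m[i] for i in first_set)
    b = prod(m[i] for i in second_set)
    return m[given] * (a - 1) * (b - 1)


if __name__ == "__main__":
    # Categories: rank (3), post high-school status (4), sex (2), occupation (7).
    m = (3, 4, 2, 7)
    print(df_h1(m, tested=1))                   # H1
    print(df_h2(m))                             # H2
    print(df_h3(m, first_set=(0, 1)))           # H3
    print(df_h4(m, given=3))                    # H4
    print(df_h5(m, (1,), (0, 2), given=3))      # H5
```

With three rank categories, four status categories, two sexes and seven occupational levels, the sketch gives 123, 143, 119 and 105 for $H_1$, $H_3$, $H_4$ and $H_5$, in agreement with the text.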


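PART II

The statistical theory basic to this type of analysis of complex contingency distributions was extended for the three-way table by Roy and Mitra.² The work below shows how this theory can be applied to a k-way table. The work of Roy and Mitra is further used to show that the test statistic appropriate for testing each hypothesis is distributed as chi-square asymptotically.

Consider a multiway table where each dimension represents a variate $i_1, i_2, \ldots, i_k$. Let $i_j = 1, 2, \ldots, m_j$ where $j = 1, \ldots, k$. Let $n$, $n_{i_1 \ldots i_k}$ and $p_{i_1 \ldots i_k}$ respectively denote the total number of observations in all cells, the number of observations in the $(i_1 \ldots i_k)$th cell, and the probability of an observation falling in the $(i_1, \ldots, i_k)$th cell. Also, where one or more of the subscripts is (or are) replaced by "0" in $n_{i_1 \ldots i_k}$, it indicates that the $n_{i_1 \ldots i_k}$ are summed over those variates. A similar explanation holds good for $p_{i_1 \ldots i_k}$. Here the $n_{i_1 \ldots i_k}$ are observed values and the $p_{i_1 \ldots i_k}$ are the values in the population. The $n_{i_1 \ldots i_k}$ are distributed as a multinomial distribution

$$\phi = \frac{n!}{\prod n_{i_1 \ldots i_k}!}\ \prod p_{i_1 \ldots i_k}^{\,n_{i_1 \ldots i_k}}$$

where $\prod$ denotes that the product is taken over all possible values of $(i_1 \ldots i_k)$.

Hypothesis of independence between "$i_1$" and "$i_2 \ldots i_k$"

$$H_1:\ p_{i_1 \ldots i_k} = p_{i_1 0 \ldots 0}\, p_{0 i_2 \ldots i_k}$$

$$\phi \text{ under } H_1 \propto \prod_1 \left(p_{i_1 0 \ldots 0}\right)^{n_{i_1 0 \ldots 0}} \prod_2 \left(p_{0 i_2 \ldots i_k}\right)^{n_{0 i_2 \ldots i_k}}$$

where $\prod_1$ and $\prod_2$ denote that the products are taken over all possible values of $i_1$ and of $(i_2 \ldots i_k)$ respectively. The p's are estimated by the maximum likelihood method, subject to the restrictions $\sum_1 p_{i_1 0 \ldots 0} = \sum_2 p_{0 i_2 \ldots i_k} = 1$, where $\sum_1$ and $\sum_2$ respectively denote that the summations run over "$i_1$" and "$i_2 \ldots i_k$".

Now consider

$$L = \sum_1 n_{i_1 0 \ldots 0} \log p_{i_1 0 \ldots 0} + \sum_2 n_{0 i_2 \ldots i_k} \log p_{0 i_2 \ldots i_k} + \lambda\left(\sum_1 p_{i_1 0 \ldots 0} - 1\right) + \mu\left(\sum_2 p_{0 i_2 \ldots i_k} - 1\right)$$

where $\lambda$ and $\mu$ are Lagrangian multipliers. Taking the partial derivatives of $L$ with respect to $p_{i_1 0 \ldots 0}$ and $p_{0 i_2 \ldots i_k}$, equating the derivatives to zero, and solving for $p_{i_1 0 \ldots 0}$ and $p_{0 i_2 \ldots i_k}$ gives

$$\hat{p}_{i_1 0 \ldots 0} = \frac{n_{i_1 0 \ldots 0}}{n}, \qquad \hat{p}_{0 i_2 \ldots i_k} = \frac{n_{0 i_2 \ldots i_k}}{n} \qquad (1)$$

The appropriate test statistic for testing $H_1$ is

$$\chi_1^2 = \sum \frac{\left(n_{i_1 \ldots i_k} - n\,\hat{p}_{i_1 i_2 \ldots i_k}\right)^2}{n\,\hat{p}_{i_1 \ldots i_k}} \qquad (2)$$

where $\sum$ denotes that the summation runs over all possible values of "$i_1 \ldots i_k$" and where $\hat{p}_{i_1 i_2 \ldots i_k} = \hat{p}_{i_1 0 \ldots 0}\, \hat{p}_{0 i_2 \ldots i_k}$, the values of $\hat{p}_{i_1 0 \ldots 0}$ and $\hat{p}_{0 i_2 \ldots i_k}$ being given above. $\chi_1^2$ is distributed as $\chi^2$ with $(m_1 - 1)(m_2 \ldots m_k - 1)$ degrees of freedom.

The hypothesis of complete independence

$$H_2:\ p_{i_1 \ldots i_k} = p_{i_1 0 \ldots 0} \cdots p_{0 0 \ldots 0 i_k}$$

where the p's are subject to the restrictions

$$\sum_{i_1} p_{i_1 0 \ldots 0} = \cdots = \sum_{i_k} p_{0 0 \ldots 0 i_k} = 1$$

By following a procedure similar to that used above the maximum likelihood estimates are obtained:

$$\hat{p}_{i_1 0 \ldots 0} = \frac{n_{i_1 0 \ldots 0}}{n}, \quad \ldots, \quad \hat{p}_{0 0 \ldots 0 i_k} = \frac{n_{0 0 \ldots 0 i_k}}{n} \qquad (3)$$

The appropriate test statistic for $H_2$ is $\chi_2^2$, which is of precisely the same form as $\chi_1^2$ where

$$\hat{p}_{i_1 i_2 \ldots i_k} = \hat{p}_{i_1 0 \ldots 0} \cdots \hat{p}_{0 0 \ldots 0 i_k}$$

$\chi_2^2$ is distributed asymptotically as $\chi^2$ with

$$\prod_{i=1}^{k} m_i - \sum_{i=1}^{k} m_i + k - 1$$

degrees of freedom.

The hypothesis of independence between two sets of variates

$$H_3:\ p_{i_1 \ldots i_k} = p_{i_1 \ldots i_h 0 \ldots 0}\, p_{0 \ldots 0 i_{h+1} \ldots i_k} \quad \text{when } h < k$$

where

$$\sum_1 p_{i_1 \ldots i_h 0 \ldots 0} = \sum_2 p_{0 \ldots 0 i_{h+1} \ldots i_k} = 1$$

and $\sum_1$, $\sum_2$ respectively denote that the summations run over $(i_1, \ldots, i_h)$ and $(i_{h+1}, \ldots, i_k)$. Following a procedure similar to that used for $H_1$ the following maximum likelihood estimates are obtained:

$$\hat{p}_{i_1 \ldots i_h 0 \ldots 0} = \frac{n_{i_1 \ldots i_h 0 \ldots 0}}{n}, \qquad \hat{p}_{0 \ldots 0 i_{h+1} \ldots i_k} = \frac{n_{0 \ldots 0 i_{h+1} \ldots i_k}}{n} \qquad (4)$$

The test statistic for $H_3$ is $\chi_3^2$, which has the same form as $\chi_1^2$ where

$$\hat{p}_{i_1 \ldots i_k} = \hat{p}_{i_1 \ldots i_h 0 \ldots 0}\, \hat{p}_{0 \ldots 0 i_{h+1} \ldots i_k}$$

$\chi_3^2$ is distributed as $\chi^2$ with

$$\left(\prod_{i=1}^{k} m_i - 1\right) - \left(\prod_{i=1}^{h} m_i + \prod_{i=h+1}^{k} m_i - 2\right)$$

degrees of freedom.

Hypothesis of independence between "$i_1, i_2, \ldots, i_{k-1}$" given $i_k$.

$$H_4:\ p_{i_1 i_2 \ldots i_k} = \frac{p_{i_1 0 \ldots 0 i_k}\, p_{0 i_2 0 \ldots 0 i_k} \cdots p_{0 0 \ldots 0 i_{k-1} i_k}}{\left(p_{0 0 \ldots 0 i_k}\right)^{k-2}}$$

where the p's are subject to the restrictions

$$\sum_{i_1} p_{i_1 0 \ldots 0 i_k} = \sum_{i_2} p_{0 i_2 0 \ldots 0 i_k} = \cdots = \sum_{i_{k-1}} p_{0 0 \ldots 0 i_{k-1} i_k} = p_{0 0 \ldots 0 i_k}$$

and $\sum_{i_k} p_{0 0 \ldots 0 i_k} = 1$.

By following a procedure similar to that used for $H_1$ the maximum likelihood estimates are obtained:

$$\hat{p}_{i_1 0 \ldots 0 i_k} = \frac{n_{i_1 0 \ldots 0 i_k}}{n}, \quad \ldots, \quad \hat{p}_{0 0 \ldots 0 i_{k-1} i_k} = \frac{n_{0 0 \ldots 0 i_{k-1} i_k}}{n}, \quad \hat{p}_{0 0 \ldots 0 i_k} = \frac{n_{0 0 \ldots 0 i_k}}{n} \qquad (5)$$

The test statistic for $H_4$ is $\chi_4^2$, which has the same form as $\chi_1^2$ where

$$\hat{p}_{i_1 \ldots i_k} = \frac{\hat{p}_{i_1 0 \ldots 0 i_k} \cdots \hat{p}_{0 0 \ldots 0 i_{k-1} i_k}}{\left(\hat{p}_{0 0 \ldots 0 i_k}\right)^{k-2}}$$

$\chi_4^2$ is distributed as $\chi^2$ with

$$m_k\left[\prod_{i=1}^{k-1} m_i - \sum_{i=1}^{k-1} m_i + k - 2\right]$$

degrees of freedom.

Hypothesis of independence between "$i_1 \ldots i_h$" and "$i_{h+1} \ldots i_{k-1}$" given "$i_k$".

$$H_5:\ p_{i_1 \ldots i_k} = \frac{p_{i_1 \ldots i_h 0 \ldots 0 i_k}\, p_{0 0 \ldots 0 i_{h+1} \ldots i_{k-1} i_k}}{p_{0 0 \ldots 0 i_k}}$$

where the p's are subject to the constraints

$$\sum_1 p_{i_1 i_2 \ldots i_h 0 \ldots 0 i_k} = \sum_2 p_{0 0 \ldots 0 i_{h+1} \ldots i_k} = p_{0 0 \ldots 0 i_k} \quad \text{and} \quad \sum_{i_k} p_{0 0 \ldots 0 i_k} = 1$$

where $\sum_1$ and $\sum_2$ respectively denote that the summation runs over all possible values of $(i_1 \ldots i_h)$ and $(i_{h+1} \ldots i_{k-1})$. Following a procedure similar to that used for $H_1$ the maximum likelihood estimates are obtained:

$$\hat{p}_{i_1 \ldots i_h 0 \ldots 0 i_k} = \frac{n_{i_1 \ldots i_h 0 \ldots 0 i_k}}{n}, \qquad \hat{p}_{0 0 \ldots 0 i_{h+1} \ldots i_k} = \frac{n_{0 0 \ldots 0 i_{h+1} \ldots i_k}}{n}, \qquad \hat{p}_{0 0 \ldots 0 i_k} = \frac{n_{0 0 \ldots 0 i_k}}{n} \qquad (6)$$

The test statistic for $H_5$ is $\chi_5^2$, which has the same form as $\chi_1^2$ where

$$\hat{p}_{i_1 \ldots i_k} = \frac{\hat{p}_{i_1 \ldots i_h 0 \ldots 0 i_k}\, \hat{p}_{0 0 \ldots 0 i_{h+1} \ldots i_k}}{\hat{p}_{0 0 \ldots 0 i_k}}$$

$\chi_5^2$ is distributed as $\chi^2$ with

$$m_k\left(\prod_{i=1}^{h} m_i - 1\right)\left(\prod_{i=h+1}^{k-1} m_i - 1\right)$$

degrees of freedom.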


PART III

In order to calculate $\chi_1^2$ the computational formula used is:

$$\chi_1^2 = n \sum_{i_2} \sum_{i_1} \sum_{i_3} \sum_{i_4} \frac{n_{i_1 i_2 i_3 i_4}^2}{n_{0 i_2 0 0}\; n_{i_1 0 i_3 i_4}} - n$$

or

$$\chi_1^2 = n \left[\sum_{i_2} \frac{1}{n_{0 i_2 0 0}} \sum_{i_1} \sum_{i_3} \sum_{i_4} \frac{n_{i_1 i_2 i_3 i_4}^2}{n_{i_1 0 i_3 i_4}}\right] - n$$

The first step in the calculation consists of computing the $n_{i_1 0 i_3 i_4}$ sub-marginal totals in Table I. This means that the total for each high-school rank by occupational level by sex classification is obtained by adding over the four categories of post high-school status. Thus, beginning in the upper left-hand corner, 87, 9, 17 and 105 are added to give the first sub-total, and the four entries ending in 209 are added to give 305 for $n_{1012}$. Considering the full classification, 42 such sub-totals are obtained, one for each line of each of the three high-school rank sections of Table I.

The second step consists of computing the quantities $n_{i_1 i_2 i_3 i_4}^2 / n_{i_1 0 i_3 i_4}$. These values should be added by going down each of the four columns representing each of the post high-school status categories. Thus for the first post high-school status category one adds

$$\frac{87^2}{\ldots} + \frac{12^2}{305} + \cdots + \frac{9^2}{\ldots} = 125.1$$

for the lowest third on high-school rank. To this is added

$$\frac{216^2}{352} + \frac{159^2}{328} + \cdots + \frac{19^2}{338} + \frac{25^2}{307} \quad \text{or } 460.1$$

for the middle third and

$$\frac{256^2}{\ldots} + \cdots + \frac{36^2}{\ldots} \quad \text{or } 952.7$$

for the upper third. These three numbers are then added together to give 1537.9, which is then divided by $n_{0100}/n$; $n$ is the total sum for the whole table or 13,968, while $n_{0100}$ is the number of people enrolled in college, that is, in the first class of post high-school status, or 3945.

The third step consists of repeating step two for category two of post high-school status. For this category one adds the corresponding sums for the three thirds on high-school rank, the second and third of which are 17.86 and 21.96. These three sums are added to give 45.07, which is multiplied by 13,968 and then divided by 678.

The fourth and fifth steps carry on a similar series of calculations for the third and fourth categories of post high-school status. The total for category three is 133.3 and for category four is 5181.6. The divisor for category three is 1232/13,968 and that for category four of post high-school status is 8113/13,968.

From the sum of the four quantities derived in steps 2, 3, 4 and 5, $n$ or 13,968 is subtracted to give $\chi_1^2$ = 2838.

The computation of $\chi_2^2$ for testing $H_2$ is begun by calculating the subtotals for each category on each of the four bases on which the 13,968 graduates have been classified. It is found that 3694 graduated in the lowest third of their class, 5584 in the middle third and 4690 in the upper third. Likewise, when the 13,968 youth are classified on the variate $i_2$ it is found that 3945 were enrolled in college, that is, in the first category of post high-school status, 678 in category two, 1232 in category three and 8113 in category four. There were 6207 boys and 7761 girls. The subtotals for the seven occupational levels were: 1826; 2184; 4464; 2649; 1021; 995 and 829.

The computational formula for $\chi_2^2$ is

$$\chi_2^2 = n^3 \sum_{i_1} \sum_{i_2} \sum_{i_3} \sum_{i_4} \frac{n_{i_1 i_2 i_3 i_4}^2}{n_{i_1 0 0 0}\; n_{0 i_2 0 0}\; n_{0 0 i_3 0}\; n_{0 0 0 i_4}} - n$$

194 JOURNAL OF EXPERIMENTAL EDUCATION

These sums can be calculated in a number of systematic orders. One good system is to divide the work into six equivalent parts by doing certain computations for each sex by high-school rank category. If this system is used there are 28 separate terms added for each of the six categories. For the males in the lowest high-school rank category the following terms are calculated:

$$\frac{87^2}{(1826)(3945)} + \frac{3^2}{(1826)(678)} + \frac{17^2}{(1826)(1232)} + \frac{105^2}{(1826)(8113)} + \frac{12^2}{(2184)(3945)} + \frac{6^2}{(2184)(678)} + \cdots + \frac{20^2}{(829)(3945)} + \frac{3^2}{(829)(678)} + \cdots + \frac{109^2}{(829)(8113)}$$

Then the sum of the above 28 terms is divided by the product of 6207 and 3694, the subtotals for males and lowest high-school rank category.

The 28 terms for the males in the middle third are obtained by calculating as follows:

$$\frac{216^2}{(1826)(3945)} + \frac{4^2}{(1826)(678)} + \frac{14^2}{(1826)(1232)} + \frac{118^2}{(1826)(8113)} + \frac{159^2}{(2184)(3945)} + \cdots + \frac{14^2}{(829)(3945)} + \frac{5^2}{(829)(678)} + \frac{13^2}{(829)(1232)} + \frac{88^2}{(829)(8113)}$$

The sum of these 28 terms is divided by the product of 6207 and 5584.

The frequencies in each of the four other sixths of Table I are treated in a similar way. When the six quotients have been obtained for the six parts of Table I, these quotients are added so that their sum may be multiplied by $n^3$. We then subtract $n$ from the resultant quantity.


For calculating $\chi_3^2$ the following formula is used:

$$\chi_3^2 = n \sum_{i_1} \sum_{i_2} \sum_{i_3} \sum_{i_4} \frac{n_{i_1 i_2 i_3 i_4}^2}{n_{i_1 i_2 0 0}\; n_{0 0 i_3 i_4}} - n$$

The subtotals needed for the first factor in the denominator can be found by adding the twelve columns in Table I. These sums, 578, 117, 217, 2782, 1410, 277, 503, 3394, 1957, 284, 512, and 1937, are $n_{i_1 i_2 0 0}$ for $i_1$ = 1, 2, 3 and $i_2$ = 1, 2, 3, 4. Likewise, the fourteen subtotals needed for the second factor in the denominator can be obtained by adding across each of the fourteen rows of Table I. These sums are 885 for the first line, 1034 for the second, and 1797, 1243, 450, 436, 362, 941, 1150, 2667, 1406, 571, 559, and 467 for the other twelve rows.

The calculations can be carried out by going down the columns. For the first column the sum of

$$\frac{87^2}{885} + \frac{12^2}{1034} + \frac{9^2}{1797} + \cdots + \frac{3^2}{467}$$

is divided by 578. For the second column, the same denominators are used but the divisor of the sum is $n_{1200}$ or 117. A similar procedure is repeated for each of the twelve columns of Table I, using as divisor the one of twelve column sums that corresponds to the column used.
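The final arithmetic for $\chi_1^2$ described in steps two through five can be re-traced with a brief sketch. The variable names are illustrative only; the four column totals and the marginal totals are the figures quoted in the text above.

```python
# Total number of graduates and the four post high-school status totals.
n = 13968
status_totals = [3945, 678, 1232, 8113]

# Column totals of n_cell^2 / n_{i1 0 i3 i4} for status categories 1 to 4,
# as reported in steps two through five.
column_sums = [1537.9, 45.07, 133.3, 5181.6]

# Each column total is divided by (status total / n); n is then subtracted.
chi_square_1 = sum(s * n / t for s, t in zip(column_sums, status_totals)) - n

if __name__ == "__main__":
    print(sum(status_totals) == n)
    print(round(chi_square_1))
```

The rounded result agrees with the reported value of 2838, and the four status totals add to 13,968 as they should.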
DETERMINANTS IN MULTIVARIATE

IN THE computation of coefficients of multiple correlation and related statistics, several modes of attack are available, differing in economy, flexibility, and interpretability of intermediate steps. Among these are: the formula methods of Yule (20), which involve partial r's and partial standard deviations of various orders, but which are practical only when the variates are few in number; generalized routines for solving n simple simultaneous linear equations with n unknowns, such as those described by Aitken (1), Crout (3), Dwyer (7, 8), Horst (10, 11), Waugh (18), and Wherry (19); and the method of reduction of criterion variance recently proposed by DuBois (4, 5), which synthesizes a simplification of the formulation of the multivariate problem with a matrix computing routine. This last method offers considerable flexibility and economy in calculation, while simultaneously providing that entries at any stage in the routine are variances or covariances of higher order residuals, or beta coefficients. It is our purpose to point out relationships between conventional methods of solving determinants and this variance-covariance procedure.

* All footnotes will appear at end of article.

Example 1 illustrates the solution of a multiple correlation problem by this method. The routine begins with a matrix of (n + 1) variates: n independent variables, and one dependent variable or criterion (always in the extreme right-hand column), designated 0. Each original variate is in z-form, with mean zero and unit variance. Entries in the principal diagonal of the original and successive matrices are variances. Those in the remaining cells are covariances in z-form, which in the original matrix are numerically equivalent to r's. (See Example 1.)

Successive matrices are produced, each with one row and one column less than the preceding. Every element in these matrices is a variance or covariance of residuals in higher order z-form. In any particular instance a residual is the original value less the portion associated with previously eliminated variates. The final, single-element matrix is the residual variance of the criterion after the portions associated with the predictor variates have been removed. When the residual variance of the criterion is subtracted from unity, the square of the coefficient of multiple correlation is obtained. Any higher order partial covariance may be transformed into a partial correlation by dividing it by the appropriate partial standard deviations. Part correlations may be calculated by dividing a partial covariance by a single partial standard deviation. The complete set of beta coefficients of the (n - 1)st order is obtained through a conventional back solution.

This computing routine is allied with well-known procedures for evaluating symmetrical determinants of any order and is similar to Dwyer's (6) method of single division applied to symmetric matrices. A presentation of the mathematical basis for this variance-covariance method shows its relationship to formulas for multivariate correlation written in the notation of determinants.
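The routine of successive residual matrices can be sketched as below. This is a minimal illustration rather than the authors' worksheet layout: the function name is hypothetical, the correlations in the demonstration are invented, and the variates are assumed to be ordered so that the first row and column are eliminated first and the criterion stands last.

```python
def successive_residual_matrices(matrix):
    """Return the original matrix followed by each matrix of residuals.

    At every stage the first variate is eliminated: each remaining entry
    c_ij becomes c_ij - c_i1 * c_1j / v_1, where v_1 is the leading
    (residual) variance. Each matrix has one row and one column less
    than the preceding.
    """
    current = [row[:] for row in matrix]
    stages = [current]
    while len(current) > 1:
        pivot = current[0][0]
        size = len(current)
        current = [
            [current[i][j] - current[i][0] * current[0][j] / pivot
             for j in range(1, size)]
            for i in range(1, size)
        ]
        stages.append(current)
    return stages


if __name__ == "__main__":
    # Invented correlations: two predictors, then the criterion (last).
    r = [[1.0, 0.5, 0.6],
         [0.5, 1.0, 0.7],
         [0.6, 0.7, 1.0]]
    stages = successive_residual_matrices(r)
    residual_variance = stages[-1][0][0]
    determinant = 1.0
    for stage in stages:
        determinant *= stage[0][0]   # product of leading residual variances
    print(residual_variance, 1.0 - residual_variance, determinant)
```

The last single-element matrix is the residual variance of the criterion, unity minus that value is the squared multiple correlation, and the product of the leading entries of the successive matrices equals the determinant of the original matrix.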
EXAMPLE 1

SUCCESSIVE MATRICES OF VARIANCES AND COVARIANCES OF VARIATES IN z-FORM

$C_{n(n-2)}$, $C_{(n-1)(n-2)}$, $V_{(n-2)}$

Matrix of First Order Residuals: $V_{(n-1).n}$, $C_{(n-1)(n-2).n}$, $\beta_{(n-2)(n-1).n}$, $V_{(n-2).n}$, $\beta_{0(n-1).n}$

Matrix of Second Order Residuals: $V_{(n-2).(n-1)n}$, $\beta_{0(n-2).(n-1)n}$

Matrix of nth Order Residuals
Aitken (1) has demonstrated that a determinant of order r, $a_{11} \neq 0$, may be evaluated simply by reducing it to a determinant of order (r - 1) times a factor, thence to a determinant of order (r - 2) times two factors, etc. This is achieved by (r - 1) repetitions of the step:

$$D^{(r)} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ a_{31} & a_{32} & \cdots & a_{3r} \\ \vdots & \vdots & & \vdots \\ a_{r1} & a_{r2} & \cdots & a_{rr} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} - \frac{a_{21}}{a_{11}} a_{12} & \cdots & a_{2r} - \frac{a_{21}}{a_{11}} a_{1r} \\ a_{32} - \frac{a_{31}}{a_{11}} a_{12} & \cdots & a_{3r} - \frac{a_{31}}{a_{11}} a_{1r} \\ \vdots & & \vdots \\ a_{r2} - \frac{a_{r1}}{a_{11}} a_{12} & \cdots & a_{rr} - \frac{a_{r1}}{a_{11}} a_{1r} \end{vmatrix}$$

This may for convenience be expressed

$$D^{(r)} = a_{11} \begin{vmatrix} b_{22} & b_{23} & \cdots & b_{2r} \\ b_{32} & b_{33} & \cdots & b_{3r} \\ \vdots & \vdots & & \vdots \\ b_{r2} & b_{r3} & \cdots & b_{rr} \end{vmatrix} \quad \text{where } b_{ik} = a_{ik} - \frac{a_{i1}}{a_{11}}\, a_{1k}$$

When this reduction process is complete

$$D^{(r)} = a_{11}\, b_{22}\, c_{33} \cdots$$

where $b_{22}$ represents cell 1 of the determinant of order (r - 1), $c_{33}$ cell 1 of the determinant of order (r - 2), etc.

This process of evaluation of a determinant is of immediate interest in DuBois' method of reduction of criterion variance. When the original matrix of variances and covariances in z-form is viewed as a determinant, the first lower order determinant obtained is comparable to the matrix which forms the first order residuals, viz.

$$D^{(n+1)} = \begin{vmatrix} V_n & C_{n(n-1)} & C_{n(n-2)} & \cdots & C_{n0} \\ C_{(n-1)n} & V_{(n-1)} & C_{(n-1)(n-2)} & \cdots & C_{(n-1)0} \\ C_{(n-2)n} & C_{(n-2)(n-1)} & V_{(n-2)} & \cdots & C_{(n-2)0} \\ \vdots & \vdots & \vdots & & \vdots \\ C_{0n} & C_{0(n-1)} & C_{0(n-2)} & \cdots & V_0 \end{vmatrix}$$

Following the process of reduction outlined, it is necessary next to divide column 1 by $V_n$. Thus the element in the first column, first row becomes 1 and the other cells in the column become $\beta$'s, so that

$$D^{(n+1)} = V_n \begin{vmatrix} 1 & C_{n(n-1)} & C_{n(n-2)} & \cdots & C_{n0} \\ \beta_{(n-1)n} & V_{(n-1)} & C_{(n-1)(n-2)} & \cdots & C_{(n-1)0} \\ \beta_{(n-2)n} & C_{(n-2)(n-1)} & V_{(n-2)} & \cdots & C_{(n-2)0} \\ \vdots & \vdots & \vdots & & \vdots \\ \beta_{0n} & C_{0(n-1)} & C_{0(n-2)} & \cdots & V_0 \end{vmatrix}$$

where $\beta_{in} = C_{in}/V_n$. Reducing the determinant,

$$D^{(n+1)} = V_n \begin{vmatrix} V_{(n-1)} - \beta_{(n-1)n} C_{n(n-1)} & C_{(n-1)(n-2)} - \beta_{(n-1)n} C_{n(n-2)} & \cdots & C_{(n-1)0} - \beta_{(n-1)n} C_{n0} \\ C_{(n-2)(n-1)} - \beta_{(n-2)n} C_{n(n-1)} & V_{(n-2)} - \beta_{(n-2)n} C_{n(n-2)} & \cdots & C_{(n-2)0} - \beta_{(n-2)n} C_{n0} \\ \vdots & \vdots & & \vdots \\ C_{0(n-1)} - \beta_{0n} C_{n(n-1)} & C_{0(n-2)} - \beta_{0n} C_{n(n-2)} & \cdots & V_0 - \beta_{0n} C_{n0} \end{vmatrix}$$

$$= V_n \begin{vmatrix} V_{(n-1).n} & C_{(n-1)(n-2).n} & \cdots & C_{(n-1)0.n} \\ C_{(n-2)(n-1).n} & V_{(n-2).n} & \cdots & C_{(n-2)0.n} \\ \vdots & \vdots & & \vdots \\ C_{0(n-1).n} & C_{0(n-2).n} & \cdots & V_{0.n} \end{vmatrix}$$

This complete determinant is comparable to the triangular matrix of first order residual variances and covariances of Example 1.

Following the process of reduction further, we divide column 1 by $V_{(n-1).n}$. Thus the first element of the first column becomes 1 and the other cells in the column become $\beta$'s, such that

$$D^{(n+1)} = V_n V_{(n-1).n} \begin{vmatrix} 1 & C_{(n-1)(n-2).n} & C_{(n-1)(n-3).n} & \cdots & C_{(n-1)0.n} \\ \beta_{(n-2)(n-1).n} & V_{(n-2).n} & C_{(n-2)(n-3).n} & \cdots & C_{(n-2)0.n} \\ \vdots & \vdots & \vdots & & \vdots \\ \beta_{0(n-1).n} & C_{0(n-2).n} & C_{0(n-3).n} & \cdots & V_{0.n} \end{vmatrix}$$

where $\beta_{i(n-1).n} = C_{i(n-1).n}/V_{(n-1).n}$. Reducing the determinant,

$$= V_n V_{(n-1).n} \begin{vmatrix} V_{(n-2).n} - \beta_{(n-2)(n-1).n} C_{(n-1)(n-2).n} & \cdots & C_{(n-2)0.n} - \beta_{(n-2)(n-1).n} C_{(n-1)0.n} \\ C_{(n-3)(n-2).n} - \beta_{(n-3)(n-1).n} C_{(n-1)(n-2).n} & \cdots & C_{(n-3)0.n} - \beta_{(n-3)(n-1).n} C_{(n-1)0.n} \\ \vdots & & \vdots \\ C_{0(n-2).n} - \beta_{0(n-1).n} C_{(n-1)(n-2).n} & \cdots & V_{0.n} - \beta_{0(n-1).n} C_{(n-1)0.n} \end{vmatrix}$$

Differences may be expressed in residual variance and covariance notation.

$$= V_n V_{(n-1).n} \begin{vmatrix} V_{(n-2).(n-1)n} & C_{(n-2)(n-3).(n-1)n} & C_{(n-2)(n-4).(n-1)n} & \cdots & C_{(n-2)0.(n-1)n} \\ C_{(n-3)(n-2).(n-1)n} & V_{(n-3).(n-1)n} & C_{(n-3)(n-4).(n-1)n} & \cdots & C_{(n-3)0.(n-1)n} \\ \vdots & \vdots & \vdots & & \vdots \\ C_{0(n-2).(n-1)n} & C_{0(n-3).(n-1)n} & C_{0(n-4).(n-1)n} & \cdots & V_{0.(n-1)n} \end{vmatrix}$$

This is the completion of the triangular matrix of 2nd order residual variances and covariances of Example 1.

Repetition of this reduction process, it is seen, will determine the successive higher order residuals of DuBois' treatment and permit final expression of the original determinant as a product of residual variances, namely,

$$D^{n+1} = V_n\, V_{(n-1).n}\, V_{(n-2).(n-1)n} \cdots V_{1.23 \ldots n}\, V_{0.12 \ldots n} \qquad (1)$$

It is noted that use of z-scores ensured that $V_n = 1$.

It is also useful to express $D_{01}^{n+1}$ as a product of residual variances and covariances. It has been demonstrated that

$$D^{n+1} = V_n\, V_{(n-1).n} \cdots V_{2.34 \ldots n}\, V_{1.23 \ldots n}\, V_{0.12 \ldots n} \qquad (1)$$

Now $D_{01}^{n+1}$ is obtained from $D^{n+1}$ by omission of variates of row 0, column 1. Because these are the last two variates in the matrix, $D^{n+1}$ is unaffected by such omission except for the last two residual variances. By examination of the determinant from which these are obtained, we note

$$D^{n+1} = V_n\, V_{(n-1).n} \cdots V_{2.34 \ldots n} \begin{vmatrix} V_{1.23 \ldots n} & C_{10.23 \ldots n} \\ C_{01.23 \ldots n} & V_{0.23 \ldots n} \end{vmatrix}$$

Omission of row 0, column 1 affects this product only at this stage so,

$$D_{01}^{n+1} = V_n\, V_{(n-1).n} \cdots V_{2.34 \ldots n}\, C_{01.23 \ldots n} \qquad (2)$$

Equations 1 and 2 demonstrate that it is possible to express determinants in the notation of variances and covariances of higher orders. The following derivations, based upon this translation, yield formulas for several coefficients used in multivariate correlation in both variance-covariance and determinant form.

The notation scheme for the formulas presented below is as follows: superscripts refer to the composition of the determinant with reference to its constituent variates, and subscripts indicate the row and column eliminated in computing a minor. As stated earlier, independent or predictor variates are designated 1, 2, 3, ... n, and the dependent variate or criterion is designated "0". The total number of variates may be indicated for greater clarity as (n + 0), rather than (n + 1). Any sub-grouping of (n + 0) will be designated "q". A major determinant is designated $D^{n+0}$. If the i-th row and the j-th column of $D^{n+0}$ are crossed out, the remaining determinant, $D_{ij}^{n+0}$, is called a first minor.** A first minor of the type $D_{ii}^{n+0}$ is a principal first minor. Crossing out two rows and two columns results in a determinant indicated as $D_{ii,jj}^{n+0}$.

Formulas may be written in determinant form for the partial variance, partial covariance, betas, multiple R, partial r, and varieties of multiple-partial and multiple-part correlation.

It is convenient to express Equation 1 as

$$D^{n+0} = V_n\, V_{(n-1).n}\, V_{(n-2).(n-1)n} \cdots V_{1.23 \ldots n}\, V_{0.12 \ldots n} \qquad (1a)$$

and similarly, from Equation 1, we may write

$$D_{00}^{n+0} = V_n\, V_{(n-1).n}\, V_{(n-2).(n-1)n} \cdots V_{1.23 \ldots n} \qquad (3)$$

Then, dividing (1a) by (3) we obtain:

$$V_{0.12 \ldots n} = \frac{D^{n+0}}{D_{00}^{n+0}} \qquad (4)$$

in which $V_{0.12 \ldots n}$ is the residual variance of 0, after variance associated with variates 1, 2, 3 ... n has been removed.

From Equation 2 it is also seen that

$$D_{01}^{n+0} = V_n\, V_{(n-1).n} \cdots V_{2.34 \ldots n}\, C_{01.23 \ldots n} \qquad (2a)$$

If a principal minor is obtained by crossing out two rows and two columns from $D^{n+0}$ rather than one row and one column, the resultant expression is analogous to (1a), so that

$$D_{11,00}^{n+0} = V_n\, V_{(n-1).n}\, V_{(n-2).(n-1)n} \cdots V_{2.34 \ldots n} \qquad (5)$$

Then, dividing (2a) by (5) we obtain

$$C_{01.23 \ldots n} = \frac{D_{01}^{n+0}}{D_{11,00}^{n+0}} \qquad (6)$$

which is the partial covariance between variates 0 and 1, after variance associated with variates 2, 3, ... n has been removed.

The beta coefficient is defined as the ratio of a partial covariance to a partial variance of the same order; in notation

$$\beta_{ij.q} = \frac{C_{ij.q}}{V_{j.q}} \qquad (7)$$

and $C_{ij.q}$ is always equivalent to $C_{ji.q}$. Therefore,

$$\beta_{01.23 \ldots n} = \frac{C_{01.23 \ldots n}}{V_{1.23 \ldots n}} \qquad (8)$$

By analogy from Equation 4,

$$V_{1.23 \ldots n} = \frac{D_{00}^{n+0}}{D_{11,00}^{n+0}} \qquad (9)$$

or

$$V_{0.23 \ldots n} = \frac{D_{11}^{n+0}}{D_{11,00}^{n+0}} \qquad (9a)$$

It follows from (6), (8), ...

The coefficient of partial correlation, $r_{01.23 \ldots n}$, may be expressed in terms of partial variances and covariances as

$$r_{01.23 \ldots n} = \frac{C_{01.23 \ldots n}}{\sqrt{V_{1.23 \ldots n}}\ \sqrt{V_{0.23 \ldots n}}} \qquad (13)$$

By substitution from (6), (9) and (9a) we obtain,

$$r_{01.23 \ldots n} = \frac{D_{01}^{n+0}}{\sqrt{D_{00}^{n+0}\, D_{11}^{n+0}}} \qquad (14)$$
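Formulas (4) and (14) can be tried numerically with a small sketch. Several choices here are assumptions for the illustration, not the authors' conventions: the helper names are hypothetical, the criterion is stored at index 0 instead of in the extreme right-hand column, and signed cofactors are used so that the partial correlation takes the usual sign, $-A_{ij}/\sqrt{A_{ii} A_{jj}}$, whereas the text writes the minors without a sign.

```python
from math import sqrt


def minor(m, i, j):
    """Matrix left after crossing out row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if not m:
        return 1.0
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def residual_variance(r):
    """Formula (4): determinant divided by the principal minor for variate 0."""
    return det(r) / det(minor(r, 0, 0))


def partial_r(r, i, j):
    """Formula (14), written with signed cofactors (assumed sign convention)."""
    cof_ij = (-1) ** (i + j) * det(minor(r, i, j))
    return -cof_ij / sqrt(det(minor(r, i, i)) * det(minor(r, j, j)))


if __name__ == "__main__":
    # Invented correlations: criterion 0, predictors 1 and 2.
    r = [[1.0, 0.6, 0.7],
         [0.6, 1.0, 0.5],
         [0.7, 0.5, 1.0]]
    print(residual_variance(r), 1.0 - residual_variance(r))
    print(partial_r(r, 0, 1))
```

For three variates the result of `partial_r` agrees with the familiar first-order formula $(r_{01} - r_{02} r_{12}) / \sqrt{(1 - r_{02}^2)(1 - r_{12}^2)}$.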


A related statistic, the coefficient of part cor- 
relation, (sometimes called ‘‘semi-partial r), 
is useful for relating a dependent variate, 0, А to 
Some independent variate, 1, after variance in 1 
associated with one or more additional or control 
variates has been removed from 1, but not, of 
Course, from 0. For example, in contrast to par- 
tial correlation Where we are correlating two res- 
iduals, 20,24... p with 2:1,23,,.p, in part corre- 
lation we are correlating 2) With z,,53,,. n. The 
formula for part correlation may be written 


C01.23...n (15) 
¥ Vi.23...n 


TOGi.23...n) = 


Substituting from (6) and ( 


9) we obtain 


Fo(t. aS... 


and then Simplifying, 


"0(1.23,. n) = Tp 00) 


or analogously, from (6) and (9a), 


п+0 
Do1 


r - 
1(0.23...n) nrw 
11,00 Ti 


In some instanc 
the variance assoc 
q, from th 


(1 6a) 


WRIGHT - MANNING - DUBOIS 2 
01 


V9. i2 
R(0. q@)[ 1. q, 2. a, . .. (n-a). a] oem 
.q 


(17) 


Again substituting, we obtain in determinant form the equation for
the multiple-partial correlation:


R 
(0. q)[ 1.q,2.q,...(n-q)- a] ° 


(18) 
If the q variates are partialled from the (n-q) independent variates
but not from the criterion, 0, we may obtain the multiple-part
correlation between (n-q) modified independent variates and an
unmodified criterion. This coefficient is written


V0.q - Vg. 12, ..n 


R 
Of 1.q,2.q,...(n-q).q] * 
(19) 


In determinant form this multiple-part correlation is
pq+0 pn+0 


Ri | 
: = 3 (20) 
[1.9,2.9,... (п-а).а] 7 E E 

в oi the type үер 


wenversely, the multiple-part 

an "s by to obtain the correlat 

Crite “ified independent variates an а 

Sim i (0.q), may also be cemonstré 

as су, the criterion (0. 4.) is repr 
5 where 


ion between (п-а 
да resiaua 
ten. For 


esent ed 


Z 
0.q 
0.а=0'= с 
а 90.4 
Ther, 
R —w edi 
(0. g)[12...(n-g] = * i- Vo 12... (n-g) 
(21) 
In q + , ow 
ua. аш form, by analogy from (фен 
ра+0' (22) 
Vo.i2...(0-9 ^ peo 
0' 0' 


Then, by substituting,


R9. q[ 12. .. (n-a)] 


From the earliest years of multivariate correlation, determinants
have been used as a means of expressing pertinent numerical
operations. Pearson (14) explicitly discusses the use of determinants
in an article published in 1903. Except for minor differences in the
notational scheme, our formulas for multiple and partial correlation
and for regression coefficients are the same as those presented by a
number of authors (9, 12, 13). From the practical point of view it is
apparent that if one begins with a matrix of correlations the
computation of these several coefficients by solving determinants
with a desk calculator is uneconomical. The variance-covariance
formulation is more direct and permits retention of statistical
meaning at every step in the computations. On the other hand, where
electronic computer programs have already been developed for
evaluating determinants, there is ample justification for using the
determinantal formulas as an alternate approach.


FOOTNOTES 


* Prepared in part under Contract Nonr 816(02) between the Office of
Naval Research and Washington University. Opinions expressed are
those of the authors and are not to be construed as representing the
endorsement of the Department of the Navy.


** In evaluating minors of the type $D_{ij}$ it is customary to take
into account the sign attached to the ij-th position; in this
instance the signed minor may be referred to as the cofactor
$A_{ij}$. The sign of the ij-th position may be determined by a
formula such as that described by Kelley (13):

\[ A_{ij} = (-1)^{(i+j)} D_{ij} \]

However, the format used to evaluate first minors by the
variance-covariance procedure is such that in the arrangement of the
matrix, the i-th and j-th variates are always adjacent at the extreme
right of the table. Hence, the quantity $(-1)^{(i+j)}$ is invariably
negative. Throughout this paper we have preferred to present formulas
in the notation $D_{ij}$ rather than $A_{ij}$; otherwise the sign
attached to the ij-th position must be taken into account.


AN EXPERIMENTAL EVALUATION OF TWO
DIFFERENT PROGRAMS OF TEACHING
HEALTH IN THE SIXTH GRADE AND
THE ADMINISTRATIVE IMPLICATIONS
INVOLVED


ARTHUR M. JENSEN
Tuttle Elementary Demonstration School
Minneapolis, Minnesota


Problem and Its Significance 


THE IMPORTANCE of problems confronting the administrator and his
staff is determined by [...] of measures of effectiveness for the
particular situation under consideration. [...] the urgency with
which a staff views the need for study of what is happening in the
[...] situation will largely determine the extent of participation
and involvement. The study reported here is an examination of how
such problems can be effectively worked out in a school. [...] in
health education have had a very active interest in studying the
nature of the instruction [...] by which they might best achieve the
objectives of health education. If one is to evaluate the
effectiveness of any program which is [...] to improve on existing
programs, or if two or more new programs are under consideration, the
method which can give the most precise measurement is that of the
modern experiment.

The study which we shall consider here is a [...] experiment to
determine [...] of two programs which [...] methods. To make a
comparison between [...], this investigation was designed to
determine the achievement toward the objectives of health education
in the sixth grade in the elementary schools.


Distinctive Features of Treatments


Two treatments of curricular content were employed in this study.
The procedure followed consisted of organizing the objectives of
health education in a manner to assure the children that the [...]
sixth grade would be taught with the best possible instruction by
either treatment. One treatment of the content employed the unit
organization of subject matter and included teacher-pupil planning
techniques and the problem-solving approach to learning. The other
treatment followed the commonly used method of integrating the
curricular content for health education into the basic subject areas,
such as social studies, science, reading, arithmetic, etc., together
with topical arrangement of content where integration did not appear
to be feasible. These two treatments of subject matter were carefully
described and followed by both teachers during the two-year period of
the experiment. Each teacher kept a log of activities for the first
year. These logs were exchanged the second year to guide each teacher
in the use of the alternate treatment.


The Situation for the Experiment


This experiment was primarily concerned with the pupils in the sixth
grade at the Tuttle Elementary School, which is a regular public
school operated under the rules and regulations of the Minneapolis
Board of Education. It is a school in a middle-class neighborhood in
an older part of the city. Since it is in the proximity of the
University of Minnesota, it frequently provides opportunities for
students to observe demonstrations in teaching techniques. The
enrollment ranges from 550 to 600 in grades kindergarten through six.
Records indicate that the average ability of the children over
several years was between 98 I.Q. and 102 I.Q. There were 15 regular
classroom teachers employed at the time of this study. The principal
serves two schools, this one three days per week and a nearby school
two days per week. There are part-time certificated personnel, who
include a nurse, a visiting teacher, a physical education teacher,
and a speech teacher. Consultants are available on call in art,
physical education, music, science, and general curriculum. In
addition, there are three custodians and a full-time clerk serving
the school. A school situation such as described would represent any
of a number of schools in Minneapolis which are comparable in size
and where the conditions may approximate those found in this school
and community. This does not, however, permit occasion to generalize
the results beyond the evidence found in the situation except by
implication.


* Arthur M. Jensen, An Experimental Evaluation of Two Different
Programs of Teaching Health in the Sixth Grade and the Administrative
Implications Involved, unpublished Ph.D. dissertation, [...]
University of [...]. Dr. Palmer O. Johnson and Dr. Otto E. Domian,
Co-[...].

** Palmer O. Johnson, [...] (New York: Prentice Hall, 1949),
pp. 109-202.


Population and Sample 


The population studied is that of an aggregate of all sixth-grade
pupils in the Tuttle School over an undefined period of time.
Basically, the assumption was that this population will persist
until such time that the socio-economic and other related factors
change the constituents of the population. The assumption made was
that the classes for the particular years used in this study could
have come from this population. A study of the means and variances of
the population over a four-year period of time from 1953 through 1957
showed that the classes were comparable at the 5 percent level of
significance with respect to intelligence quotients. This was
determined by testing the homogeneity of the variances and the
equality of the means. The L1 test was used to test the differences
between the variances and the analysis of variance was used to test
the differences between the means. The samples or classes used in the
experiment were shown to be representative of this population, which
was comprised of the sixth-grade pupils for the years 1955-56 and
1956-57. A total of about 60 pupils was selected for each year,
comprising a total of about 120 cases. In the final outcome, a total
of 96 pupils was used due to late entrants and dropouts. There were
four classes used, two in any given year. Assignment to these classes
was made at random.


Experimental Design 


From the standpoint of experimental designs, we may designate this
study as unrestricted randomization. The pupils were assigned to the
classes, each of which was taught by a different teacher. The method
of selection to classes provided a procedure whereby every pupil in
the sixth grade for the years 1955 through 1957, at Tuttle School,
had an equal chance of being in a particular class with a particular
program which we have referred to as treatments. Each year of the
two-year [...] students were randomly [...] the other treatment was
the standard one using random-sampling numbers. Briefly, this
consisted of giving every sixth-grade pupil comprising the experiment
a number and then entering a table of random-sampling numbers in the
accepted manner and placing each pupil into one or the other
sixth-grade class.** A coin flip determined the teaching treatment
for each class; the second class received the alternative treatment,
with the restriction that during the second year of the experiment
the teachers were assigned to the alternative treatment.
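
The assignment procedure just described can be sketched in a few lines. This is an illustration only, not the procedure actually used: a seeded pseudo-random generator stands in for the printed table of random-sampling numbers, the equal class sizes are an assumption, and the function name `assign_classes`, the roster, and the seed are hypothetical.

```python
import random


def assign_classes(pupil_ids, seed=None):
    """Split pupils into two classes at random, then pick treatments by coin flip."""
    rng = random.Random(seed)   # stand-in for the printed random-number table
    shuffled = list(pupil_ids)
    rng.shuffle(shuffled)       # every ordering of pupils is equally likely
    half = len(shuffled) // 2   # assumption: classes of (nearly) equal size
    class_a, class_b = shuffled[:half], shuffled[half:]
    # Coin flip for the first class; the second class gets the other treatment.
    first = rng.choice(["unit", "integrated"])
    second = "integrated" if first == "unit" else "unit"
    return {"class_a": (first, class_a), "class_b": (second, class_b)}


# Hypothetical roster of 60 pupil numbers for one school year.
assignment = assign_classes(range(1, 61), seed=1957)
for name, (treatment, pupils) in assignment.items():
    print(name, treatment, len(pupils))
```

The restriction that each teacher switches treatments in the second year is not modeled here; it would fix the second-year treatments once the first-year coin flip is known.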

Preliminary to the three-way analysis of the factors of teacher,
treatment, ability level and their interactions, the absolute gains
in achievement were analyzed. This was done by the use of the
"t"-test for the significance of the difference between the means on
the pre- and post-tests. Likewise, the variances were used for
testing the significance of the difference in variability on the
pre- and post-tests. The appropriate "t"-tests for correlated data
were used in both cases.

The design was a 3 x 2 x 2 arrangement of data
involving teacher, treatment and student ability 
level. The analysis of variance and covariance 
were utilized in the analysis of the data. Both the 
absolute gains and the relative gains in achieve- 
ment were studied in the analysis of the data. 

The inventories of attitudes were studied primarily for the changes
in attitudes that children may have undergone from pre- to
post-testing and to determine whether or not the children responded
differently from the teachers' expected responses. The Chi-square
test of independence was applied for testing out differences in
attitudes between treatments which may have existed. The results of
the inventory of attitudes administered to parents were treated in a
similar manner.
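
As a reminder of how a Chi-square test of independence is computed, the sketch below works through a single two-by-two table. The counts are hypothetical (the item-level counts are not reported here), the function name `chi_square` is ours, and no continuity correction is applied, which may differ from the procedure actually followed.

```python
def chi_square(table):
    """Pearson Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows are treatments, columns are "changed" / "did not change".
counts = [[30, 18],
          [22, 26]]
stat, dof = chi_square(counts)
print(round(stat, 3), dof)
```

With one degree of freedom the statistic would have to exceed about 3.84 to be significant at the 5 percent level.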


Evaluation Instruments 


One achievement test for health knowledge con- 
structed by the writer and a commercial test were 
used to measure achievement. Careful attention 
was given to item analysis, defining validity, test- 
ing out reliability, and preliminary testing for the 
constructed test. In addition, two inventories of 
attitudes were used for the purposes discussed 
previously. 


Instructional Methods 


An entire year's work based on the health curriculum for the sixth
grade in the Minneapolis Public Schools was planned and taught by two
regular teachers. The curriculum was based upon the scope and
sequence of the content found in an examination of the children's
textbooks on health and the health education guide furnished by the
schools, together with research recommendations. This procedure
resulted in organizing the content in nine broad categories of health
knowledge and attitudes for identification. A basic assumption was
made that the experimental class which used the unit arrangement of
subject matter and teaching technique would include the same year's
curriculum content as the control class. This class had the content
integrated into other subject areas and was taught by the method
generally used in the schools.

The general plan for unit development was utilized by the
experimental group throughout the study. A well-established procedure
followed by the schools was recommended. This plan is often analyzed
into several stages of development, for example: introduction,
exploration, problem-setting, problem-solving, summarizing and
evaluating. Each unit that was selected was developed by
teacher-pupil planning techniques. The contrasting or control method
was primarily that of integrating health education into the basic
areas of the curriculum. This is frequently supplemented by topical
arrangement of material not easily integrated. It likewise includes
timely topics and teachable moments when events or experiences arise
that need immediate attention. However, experience has demonstrated
that mutually exclusive methods of teaching are hard to find. A
serious attempt was made by the participating teachers to constantly
evaluate their work in terms of the type of teaching that they were
doing in order not to introduce bias into the experiment.


Analysis of the Experimental Data 


An over-all description of the experimental results for the Health
Knowledge and Attitude Test is observable in Table I. This summary
table for the means and standard deviations shows that the results on
the means followed a uniform pattern, that is, the means increased
between the pre- and post-tests. This same table shows that a
reduction in the standard deviations occurred between the pre- and
post-tests. The main analysis is concerned with the comparison
between treatments. An analysis of the experimental data was made
first to determine if there had been a significant growth under each
of the treatments. The difference between the means on the pre-test
and the post-test was tested for statistical significance by the
application of the appropriate "t"-test of significance for
correlated data. The "t"-test on the significance of the difference
between the variances was also made on the pre- and post-tests.

The basic data and results of the test of significance for the
Health Knowledge and Attitudes Test are found in Table II. Table II
reveals that the sixth-grade students in each of the four classes
were consistent in making gains. The gains in performance ranged from
8.79 to 12.54. All gains were statistically significant.
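
The t values reported in Table II can be checked from the table's own columns, since for correlated data the ratio is the mean gain divided by the standard error of the difference in means. The sketch below simply repeats that division; the four rows are transcribed from Table II, and the variable names are ours.

```python
# (year, group, mean gain, standard error of the gain, t as printed in Table II)
table_ii = [
    ("First", "Integrated", 9.83, 1.75, 5.61),
    ("First", "Unit", 10.63, 1.70, 6.25),
    ("Second", "Integrated", 8.79, 2.31, 3.80),
    ("Second", "Unit", 12.54, 2.64, 4.75),
]

recomputed = []
for year, group, gain, std_error, t_printed in table_ii:
    t = gain / std_error  # t ratio for a correlated (paired) difference
    recomputed.append(t)
    print(f"{year:6s} {group:10s} t = {t:.3f} (printed {t_printed:.2f})")
```

Each recomputed ratio agrees with the printed value to within rounding of the tabled entries; each is referred to the t distribution with 23 degrees of freedom, as the table indicates.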


The next comparison of pre-test and post-test performance scores
involved the variances. Table III contains the initial and final
variances and the test of significance.

It is observed from Table III that in three of the four classes there
was a reduction in variability within classes.

The second and main analysis concerns the determination of the
differential effects of the contrasting treatments. Using the
three-way classification system for organization of the data provided
an analysis giving meaningful interpretation regarding treatments,
teachers and mental ability. This also made it possible to study the
several interaction effects. Shown in Table IV are the various
effects set forth and the number of degrees of freedom assigned to
each. In addition, this table shows the interactions that were
studied.
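
The partition of degrees of freedom in Table IV follows directly from the classification: each main effect has one fewer degree of freedom than it has levels, each interaction has the product of its factors' degrees of freedom, and the residual takes what remains of N - 1. The sketch below reproduces that bookkeeping for two treatments, two teachers, three I.Q. levels, and the 96 pupils retained; the variable names are ours.

```python
from itertools import combinations
from math import prod

levels = {"Treatment": 2, "Teacher": 2, "I.Q.": 3}  # three-way classification
n_pupils = 96                                        # pupils retained in the study

degrees = {}
for size in range(1, len(levels) + 1):
    for combo in combinations(levels, size):
        # An effect's degrees of freedom: product of (levels - 1) over its factors.
        degrees[" x ".join(combo)] = prod(levels[name] - 1 for name in combo)

df_total = n_pupils - 1
df_residual = df_total - sum(degrees.values())

for source, df in degrees.items():
    print(f"{source:28s} {df}")
print(f"{'Residual or error':28s} {df_residual}")
print(f"{'Total':28s} {df_total}")
```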

This tabular arrangement illustrates in concise 
form the basis for testing out the various hypothe- 
ses of the effects of teacher, treatment, ability 
and interactions. 

The analysis of the means on the Health Knowledge and Attitudes Test
was accomplished by the analysis of variance technique. There were
significant differences among the three I.Q. groups as shown in
Table V. This motivated the study of the adjusted means.

The relationship between ability and achievement for three of the
four classes on initial testing indicated that the means of the
highest I.Q. level were above those of the middle I.Q. level and the
means of the middle I.Q. level were higher than the means of the
lowest I.Q. level. The exception occurred with the integrated group
with Teacher (1), where the initial mean was higher for the middle
I.Q. group than for the highest I.Q. group. This situation was
occasion to re-analyze the final test scores by adjusting for the
initial differences by the analysis of covariance. Table VI contains
this analysis.

An examination of Table VI shows that no significant differences were
found between the mean achievement under the two treatments, nor were
any of the interactions found to be statistically significant. There
was, however, a significant difference among the means of the
students of different levels of ability.
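
The F ratios in Table VI can be recomputed from the tabled sums of squares: each mean square is a sum of squares divided by its degrees of freedom, and each F is that mean square divided by the residual mean square. The figures below are transcribed from Table VI; the reading of the second column as adjusted sums of squares is our assumption, and the variable names are ours.

```python
# source: (degrees of freedom, sum of squares), as read from Table VI
effects = {
    "Treatment": (1, 0.30),
    "Teacher": (1, 13.30),
    "I.Q.": (2, 634.18),
    "Treatment x Teacher": (1, 5.13),
    "Treatment x I.Q.": (2, 23.18),
    "Teacher x I.Q.": (2, 77.94),
    "Treatment x Teacher x I.Q.": (2, 42.92),
}
residual_df, residual_ss = 83, 6096.41

residual_ms = residual_ss / residual_df  # error mean square
f_ratios = {name: (ss / df) / residual_ms for name, (df, ss) in effects.items()}

for name, f in f_ratios.items():
    print(f"{name:28s} F = {f:5.2f}")
```

Only the I.Q. effect yields a ratio above 1, in agreement with the table. The residual has 83 rather than 84 degrees of freedom because one is used in adjusting for the initial scores.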


Investigation of Health Attitudes 


The Chi-square test of significance of independence of change in
responses to items from pre- to post-testing was utilized for the
student and parent inventories. Pupil responses were compared to
responses selected by the teacher by use of the Chi-square test of
independence. The pre- and post-responses by treatment were also
compared. The results on the inventories of attitudes which were
specifically prepared for the study indicated that:


TABLE I


MEANS AND STANDARD DEVIATIONS FOR THE FOUR CLASSES BY TEACHER
AND TREATMENT FOR THE HEALTH KNOWLEDGE AND ATTITUDE TEST


                          Teacher (1)              Teacher (2)
                            Method                   Method
                       Integrated     Unit      Integrated     Unit

Means
  Pre                    77.46       72.37        79.21       73.95
  Post                   87.29       84.91        88.00       84.58
  Difference              9.83       12.54         8.79       10.63

Standard Deviations
  Pre                    14.74       19.09        13.05       18.68
  Post                   10.34       12.96        13.66       17.09
  Difference              4.40        6.03          .61*        .59

* The only Standard Deviation that increased. All the others decreased.
TABLE II


TEST OF THE SIGNIFICANCE OF THE MEAN GAIN FOR EACH CLASS
ON THE HEALTH KNOWLEDGE AND ATTITUDES TEST


                        Difference    Standard Error
                         Between      of Difference
Year     Group            Means         in Means        t      D.F.   Conclude*

First    Integrated        9.83           1.75         5.61     23       S
First    Unit             10.63           1.70         6.25     23       S
Second   Integrated        8.79           2.31         3.80     23       S
Second   Unit             12.54           2.64         4.75     23       S

* S = Significant gain (.05 level); all significant at .001 level.


TABLE III


VARIANCES AT THE BEGINNING AND END OF THE SIXTH GRADE AND
SIGNIFICANCE OF THEIR DIFFERENCE FOR HEALTH KNOWLEDGE
AND ATTITUDES TEST


Year     Group          s² pre     s² post    Difference      t      D.F.   Conclude*

First    Integrated     217.30     107.08      -110.22      -2.98     22       S
First    Unit           349.26     292.25       -57.01       -.94     22       NS
Second   Integrated     170.35     190.87        20.52        .35     22       NS
Second   Unit           364.77     168.17      -196.60      -2.76     22       S

* NS = Not significant. S = Significant.


TABLE IV


SOURCES OF VARIATION AND DEGREES OF FREEDOM


Source of Variation              Degrees of Freedom

Treatments                               1
Teachers                                 1
Ability levels (I.Q.)                    2
Treatment x Teacher*                     1
Treatment x I.Q.                         2
Teacher x I.Q.                           2
Method x Teacher x I.Q.                  2
Residual or error                       84

Total                                   95

* The treatment x teacher interaction is confounded with year
differences.


TABLE V


MEAN SCORES ON THE HEALTH KNOWLEDGE AND ATTITUDES
TEST FOR EACH OF THE I.Q. GROUPS AT THE END OF
THE EXPERIMENT


I.Q. Group          Mean Scores

Upper third            93.66
Middle third           88.97
Lower third            75.97
TABLE VI


ANALYSIS OF MEANS ON HEALTH KNOWLEDGE AND ATTITUDES TEST AT
END OF EXPERIMENT ADJUSTING FOR INITIAL SCORES


                                          Sums of       Mean
Source of Variation             D.F.      Squares      Square        F       Conclude*

Treatment                         1          .30          .30      < 1.00       NS
Teacher                           1        13.30        13.30      < 1.00       NS
I.Q.                              2       634.18       317.09        4.32       S
Treatment x Teacher               1         5.13         5.13      < 1.00       NS
Treatment x I.Q.                  2        23.18        11.59      < 1.00       NS
Teacher x I.Q.                    2        77.94        38.97      < 1.00       NS
Treatment x Teacher x I.Q.        2        42.92        21.46      < 1.00       NS
Residual                         83     6,096.41        73.45

Total                            94

* NS = Not significant. S = Significant.


1. Changes in responses to items on the Health Attitude Inventory
   were significant for the second year of the experiment at the
   5 percent level.

2. When comparison was made of selected items by treatment effects,
   only one item was significant at the 5 percent level.

3. The results were not significantly different when the pupil
   responses were compared to those deemed desirable by the teachers.

4. Pre- and post-responses compared by treatment were found to be
   statistically significant.

5. When responses from pre to post on the parents' Inventory of
   Health Practices of Children were studied, there was a change in
   attitudes at the end of the study as compared to those at the
   beginning. There was also a statistically significant difference
   between treatments.


Summary


The main conclusions based on the evidence presented in the reported
study are:

1. The effects in promoting achievement of the objectives of health
   education showed a significant gain on the means for each
   treatment.

2. A comparison of the two treatments did not reveal statistically
   significant differences among the classes studied.

3. There were no statistically significant differences shown between
   teachers. The conclusion drawn is that the teachers were equally
   effective in using the integrated and unit arrangements.

4. Achievement among the I.Q. groups differed significantly; that is,
   the superior students surpassed the average and the average
   surpassed the inferior.

5. As far as the observable outcomes were concerned, the experimental
   evidence showed no significant results with respect to differences
   between teachers and between treatments and their interactions.

6. Evidence did indicate that some significant changes in attitudes
   among children took place. This was also noted for the parents.

Implications for Administration 


One purpose advanced for this study pertained to the use of the
results for administrative considerations regarding health education.
There were several things that may be implied from the results. In
the first place, it was shown that the involvement of teachers in a
long study resulted in a real challenge and desire on the part of the
teachers to examine their purposes and practices more objectively. A
second outcome that became apparent as the study progressed was the
increased interest the teachers had in the procurement of supplies,
equipment, and related teaching materials for both treatments. Since
the results regarding superiority of one treatment compared to
another were not conclusive, it would not be expedient to adopt one
or the other without more objective evidence. Results from this
experiment may suggest some assurance that either the unit treatment
or the integrated treatment could be employed without loss to the
students in health education. As long as either method is taught at
its best, the administrator could feel a degree of confidence that
the objectives are being fulfilled. A third implication that may be
inferred from the study relative to the cost for the treatments is
that there is not enough evidence to show which treatment would be
less expensive than the other if both achieve the same objectives
satisfactorily.

There are so many possible variations by both treatments that the
evidence suggests a possible combination of both treatments used in
this study. This appears to hold the promise of utilizing unit
arrangement for those phases of health education that can be
organized in that manner and also employing an integrated approach to
materials that do not apply to any organized pattern.

A general implication that may assist in the administration of the
health program, resulting from the examination of the data, is that
neither teacher, method, nor ability levels operate in isolation. It
is the composite effects of these and other influences that encourage
learning in health education in meeting the psychological,
sociological, physiological, pedagogical and democratic objectives of
education.


Implications for Further Research in
Health Education


In the area of health education this study should have some meaning
for future research. It may suggest that further investigation would
follow up, in greater detail, the observations of the behavior of
children in a health education program conducted under contrasting
treatments. Also, it may suggest that the factor of retention of
information, [...] and other related learning experiences would
present a challenging inquiry, particularly in the normal behavior of
health. A follow-up study of what pupils say they do compared to
their observed reactions may be implied as a likely study in health
education. These and other possible directions for future research
also suggest that the instruments for evaluation need to be further
developed to serve these purposes.


THE RELATIONSHIP OF STUDY HABITS
AND OTHER MEASURES TO ACHIEVE-
MENT IN NINTH-GRADE
GENERAL SCIENCE


DANIEL P. NORTON 
Hibbing High School 
Hibbing, Minnesota 


Problem 


UNDER THE apparent assumption that certain mechanical procedures are
significant contributors to achievement in the various fields of
learning, a large amount of effort has been directed toward
identifying the procedures which correlate most highly with
achievement. It was the purpose of this study to re-define
studiousness in terms which would permit application of multiple
regression analysis to the question, "Does achievement in ninth-grade
general science relate more closely to study habits than
intelligence, reading ability, and aptitudes?"


Population and Sample 


During the school year 1957-58 five general science sections in the
High School Building of the Hibbing, Minnesota, schools were
available for daily observation and constituted the population
dures were fol- 


beyond those 
From 


(red general science, and whose р 
8 programs had been identical. 


Measuring Instruments and Techniques


The study incorporated six independent variables and one dependent.
The independent variables were:

Iowa Silent Reading            X1
Iowa Algebra Aptitude          X2
Otis Quick Scoring             X3
Student Rating                 X4
Instructor Rating              X5
Differential Aptitudes         X6
    Verbal Reasoning
    Abstract Reasoning
    Space Relations
    Mechanical Reasoning

The first three measurements had been secured 
previously; the raw scores were used. Student 
Rating of study habits and application was secured 
near the conclusion of the course. Each student 
in the sample sections was rated by five other stu- 
dents assigned to the same section in accordance 
with a rating scale and instructions developed by 
the instructor. The problems involved in develop- 
ment of a useful Student Rating were (a) selection 
of an appropriate scale, (b) securing an ‘‘honest’’ 
rating quickly, thereby preventing consideration
of friendships or knowledge of achievement, either 
of which might have influenced results, and (c) min- 
imizing the complications of effort involved by 
identifying the smallest number of such ratings 
necessary for each student to acquire a meaning- 
ful composite. 

Each student rated and was rated five times. 
Ratings were conducted by random assignment 
within sections and followed an accurately timed 
schedule that provided fifteen seconds to make the 
first rating requested by the somewhat unfamiliar 
scale. Thirty seconds were allowed to complete 


the other four ratings by the form which follows: 

4 BEST By the scale at 
the left, I rate 

3 the study habits 
and application 

2 AVERAGE of 

1 Jane Smith 

0 POOREST as r 


The five ratings each student received were summed in Table I.
Reliability of the method was untested except by inspection. A
consistency of rating seemed apparent. While it would have been
desirable to have no summed ratings of twenty, adequacy of rating
sample size seems as-


TABLE I


DISTRIBUTION OF SUM OF STUDENT RATINGS AND INSTRUC TOR 
RATINGS BY SEX 


Student Instructor 
Sum of a 
Ratings Boys Girls Boys Girls 
20 1 1 1 2 
19 2 5 - 4 
18 - 7 1 8 
17 1 1 1 4 
16 1 6 2 3 
15 1 1 4 3 
14 2 1 1 3 
13 2 6 1 5 
12 5 3 2 4 
11 8 5 6 3 
10 3 2 5 2 
р 3 2 1 1 
1 5 3 3 3 
6 2 4 3 2 
5 2 3 1 1 
1 1 4 1 
4 2 2 2 4 
4 - = 2 
2 - z 
1 й - - = 
0 = = - 


| 
| 
| 
| 


Тоїа1 


r3 
[n 
сл 
©з 
rs 
[n 


й 


NORTON 


TABLE П 


STATISTICAL SUMMARY FOR STEP ONE BY SEX 


Statistic Boys Girls 
N 41. 53. 
X, 154.4146 157.2642 
X5 55. 8293 57.4528 
X3 53.3415 53. 1136 
X4 10. 8293 12. 6981 
X5 10. 0000 13. 2642 
X6 49. 6585 50. 2311 
Y 55. 5610 51.1868 
ry . 5814 ‚5927 
Ed .3915 . 6448 
E . 5533 . 7061 
Y ` 5994 . 6304 
Ed ` 3048 . 3761 
y . 7638 
rey ‚5595 
‚5402 . 5690 
112 . 7030 . 7477 
13 . 4012 . 6142 
114 12820 .4188 
RS ‚5195 - 6403 
16 . 6664 50117 
r23 ` 3740 . 1278 
124 ‚4415 ‚5157 
125 ‚4539 «5981 
aoe . 4889 . 6399 
34 2239 = 
735 ‚ 5153 d 
r36 5173 ‚1995 
r45 2081 oo 
746 14490 «3888 
56 
2 143. 14878349 135. 77503628 
1 
: 102. 14512195 149. 48331000 
s 
2 
Ç 63. 13048785 88. 75544267 
s 
3 
- 14, 19512195 21. 90711176 
s 
4 
А 20. 35000000 22. 65965418 
s 
5 13.3481 
8 ga, 55236200 34819485 
s 
6 127. 96371917 
2 101. 85248113 
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sured. Š : 

The same numerical values were utilized in Instructor Rating as
Student Rating. Five such ratings were made at approximately monthly
intervals, one on each of the days of the school week. Times selected
were made to coincide with major tests so that each student would be
visible for enhanced recall of behavior, and interruption would be
least apt to occur. Instructor Ratings differed most from Student
Ratings in that they were meant to evaluate not what might be called
an "impression", but an accumulation of evidence and impressions.
Notes were made in the class grade book when a student (a) did not
complete an assignment, (b) had not promptly begun an assignment when
it was made, (c) seemed to daydream, etc. These notations were
reviewed when ratings were made.

As was the case with Student Ratings, the distribution for boys is
somewhat normal to inspection; that for the girls is bimodal. There
is also a noticeable skewness to the upper range for which a possible
explanation is not hard to find. It was the unsolicited opinion of
most instructors to this class that they were unusually studious.
This was purportedly most true of the girls. The author concurs.

The Differential Aptitude Tests were administered at the start of the
school year. The batteries used were selected for probable lack of
overlapping with other measures. Z-scores were summed. Raw scores
were used for the first three measures and achievement.


and .824, respectively, 


Split-half reliability was calculated for odd and even items. For
boys the absolute and relative reliabilities were 4.021 and .77,
respectively; for girls they were 3.607 and .917.
Analysis 

Analysis proceeded separately for boys and girls and followed th


1. Palmer 9. Johnson. Statis 
2. The number of i 
d 
Variables involved. Susi 


arried seems an absolute minimum for the method used and th 


calculated.2 Zero-order correlations between the independent variables and the dependent variable were all positive, the lowest for each group being $r_{5y}$ (Table II).
In step two, Fisher's auxiliary statistics, the $g_{ij}$'s, were calculated for the six systems of simultaneous equations, after which it was possible to compute $R_{y.123456}$, the multiple correlation between Y and X1, X2, ..., X6.
For step three, define:

$$B_i = \sum_{j} g_{ij}\, r_{jy} \qquad (i, j = 1, \ldots, 6)$$

where $B_i$ is the standard partial regression coefficient.

Define again:

$$R^2_{y.123456} = \sum_{i} B_i\, r_{iy} \qquad (i = 1, \ldots, 6)$$


The standard partial regression coefficients are recorded in Table III and the multiple correlations in Table IV. The latter may be regarded as very high, indicating that a strong relationship is present.

In step four the significance of $R_{y.123456}$ was tested by means of the variance ratio, which in this case is the ratio of the mean square associated with regression to the mean square associated with the residual:

$$F \text{ (variance ratio)} = \frac{R^2_{y.123456} / m}{(1 - R^2_{y.123456}) / (N - m - 1)},$$

where $m$ is the number of degrees of freedom associated with regression (in this case six), $N$ is the number of subjects, and other symbols are as used previously. The results were statistically significant (Table V).
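The variance ratio in step four can be checked with a few lines of Python. This is a minimal sketch, not the author's worksheet: the function name `variance_ratio` is hypothetical, the boys' R² is taken from Table IV, and N = 41 is the boys' sample size mentioned later in the text. The result should land close to the 9.7696 reported in Table V, with the small gap presumably due to rounding in the original hand computation.

```python
def variance_ratio(r_squared: float, n: int, m: int) -> float:
    """Return F = (R^2 / m) / ((1 - R^2) / (n - m - 1)).

    r_squared: squared multiple correlation
    n: number of subjects
    m: number of independent variables (regression degrees of freedom)
    """
    regression_mean_square = r_squared / m
    residual_mean_square = (1.0 - r_squared) / (n - m - 1)
    return regression_mean_square / residual_mean_square


# Boys: R^2 from Table IV, N = 41 (stated in the text), m = 6 predictors.
f_boys = variance_ratio(0.63294503, n=41, m=6)
print(round(f_boys, 2))
```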


The significance of each $B_i$ was tested in step five. Define:

$$S_{B_i} = \sqrt{\frac{(1 - R^2_{y.123456})\, g_{ii}}{N - m - 1}}$$

where $S_{B_i}$ is the standard error of $B_i$. The test of significance of each $B_i$ is given by:

$$t = \frac{B_i}{S_{B_i}}$$

with $N - m - 1$ degrees of freedom.
For both sexes, $B_6$, calculated from Differential Aptitude Test data, was significant at the one


1. Palmer O. Johnson, Statistical Methods in Research (New York: Prentice-Hall, Inc., 1949).


e number 0f 




NORTON 


TABLE III

STANDARD PARTIAL REGRESSION COEFFICIENTS (Bi)'s BY SEX

          B1       B2       B3       B4       B5       B6
Boys     .2581    .0496   -.1990    .6220   -.3027    .5237
Girls   -.0300    .2017    .2524    .0342    .0189    .4526


TABLE IV

MULTIPLE CORRELATION BETWEEN THE DEPENDENT VARIABLE (Y) AND THE INDEPENDENT VARIABLES (Xi)'s BY SEX

          R²y.123456     Ry.123456
Boys      .63294503      .7956
Girls     .66562245      .8159
TABLE V

PROBABILITIES ASSOCIATED WITH THE OBSERVED VARIANCE RATIOS BY SEX

          Variance Ratio (F)     Probability (P)
Boys            9.7696                <.01
Girls          15.2600                <.01
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TABLE VI 


PROBABILITIES ASSOCIATED WITH THE STANDARD PARTIAL REGRESSION COEFFICIENTS (Bi)'s AS DERIVED BY "t"-TEST

               Boys                            Girls
Bi      t(Bi)     Probability           t(Bi)     Probability
B1      1.710     .10 > p > .05        -0.221     .90 > p > .80
B2      0.317     .80 > p > .90         1.542     .20 > p > .10
B3     -0.971     .40 > p > .30         1.675     .20 > p > .10
B4      4.172     .001 > p              0.187     .90 > p > .80
B5     -1.927     .10 > p > .05         0.141     .90 > p > .80
B6      3.436     .01 > p > .001        3.350     .01 > p > .001


TABLE VII

PROBABILITIES ASSOCIATED WITH THE DIFFERENCE BETWEEN STANDARD PARTIAL REGRESSION COEFFICIENTS FOR BOYS AND GIRLS AS DERIVED BY "t"-TEST

          t(B4 - B5)     Probability
Boys        6.624        .001 > p
Girls       0.110        p > .90
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mds level; for the boys, B4, Student Rating, 
was significant at the one-tenth percent level. Other Bi's were "large" but not enough to permit a statement of significance at the five percent level usually considered minimal. The patterns for boys and girls were noticeably different (Table VI).

Two terminal tests were then made. First, it was considered valuable to know whether Student Ratings and Instructor Ratings differed significantly from each other. Stated in the null, the hypothesis may be symbolized: H0: B4 = B5, for each sex (alternate: B4 > B5). A "t" test may again be made in which:


$$t = \frac{B_4 - B_5}{\sqrt{S^2_{Res}\,(g_{44} - 2g_{45} + g_{55})}}, \qquad S^2_{Res} = \frac{1 - R^2_{y.123456}}{N - m - 1}$$

The results indicate a highly significant difference in the ratings for the boys only (Table VII). The hypothesis is accepted for girls, rejected for boys.


Second, the proportion of the total variance on the achievement test accounted for by the linear combination X1, X2, ..., X6 has been found to be 0.633 and 0.666 for boys and girls, respectively. With only B6 significant at the prescribed level for girls, no further calculations were made. For boys, however, with a sample of forty-one, B1, B4, B5 and B6 may be taken as significant to a greater or lesser degree. Additional calculations were elected for these.

The percentage of association of each of the independent variables with the variance of the dependent variable is given by the square of its standard partial regression coefficient. The sum of these squares may be treated as the total (100%) of the variance accounted for by the set of four factors. The respective portion accounted for by each factor may be easily determined by division. This reveals ... percent associated with ... and 3... percent with Differential Aptitude Test scores.
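The division described above can be sketched in Python. This is an illustration only, assuming the four boys' coefficients from Table III are the inputs; the name `variance_shares` is hypothetical, and the printed shares are a recomputation rather than figures recovered from the original text.

```python
# Boys' standard partial regression coefficients from Table III
# (only the four the text treats as significant to some degree).
betas_boys = {"B1": 0.2581, "B4": 0.6220, "B5": -0.3027, "B6": 0.5237}


def variance_shares(betas):
    """Square each coefficient and express it as a percent of the sum of squares."""
    squares = {name: b * b for name, b in betas.items()}
    total = sum(squares.values())
    return {name: 100.0 * sq / total for name, sq in squares.items()}


shares = variance_shares(betas_boys)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The sign of a coefficient drops out when it is squared, so the negative B5 (Instructor Rating) contributes a positive share in this scheme.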


Conclusions

1. This investigation did not find study habits, as assessed by Instructor Rating, to be more closely associated with science achievement in the ninth grade than intelligence, reading, and aptitudes. When measured by Student Rating, it was more closely associated for boys.

2. As rated by other students, the study habits of boys were a statistically significant predictor of science achievement; as rated by the instructor, their study habits neared significance negatively.

3. The difference between Instructor Rating and Student Rating of study habits of boys was significant beyond the one percent level; the difference was not significant for girls.

4. Aptitudes, as measured by the Differential Aptitude Tests, were the most significant predictor for both sexes considered together.

5. Instructor Rating appeared less valuable for predictive purposes than any other independent variable.


Summary and Recommendations 


The study required development of a technique for measuring study habits or application of students and subsequent multiple regression analysis of six independent variables and one dependent variable. Particular weaknesses were the doubtful nature of what students actually meant by their ratings and the limited precision of Instructor Rating. They were offset to an extent by the fact that correlations secured by Student Rating were as high as or higher than those customarily found by inventory methods. Also important is the frequent agreement with results obtained in previous research.
While the more mechanical aspects of learning are easiest to note, this does not preclude that they may be more extraneous than basic. Learning may proceed more from the less tangible factors such as attitude and aptitude than is commonly conceded. If so, the researcher might do well to review his mental sets with respect to study habits. Perhaps he should investigate the thought processes involved more thoroughly.

What difference exists between the learning processes of girls and boys? Surely they may differ in a significant manner. If research to date has passed by and failed to identify the factor or factors more basic to science achievement, additional effort should be expended on that behalf. Further, if the achievement patterns of boys are typically underestimated as a result of "study habit" considerations, care should be exercised in their favor. This would be particularly true in departments where aptitude test scores might predict achievement better than has been realized.
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STATE LIMITATIONS ON LOCAL PUBLIC 


SCHOOL EXPENDITURES IN THE 
UNITED STATES 


HARRY E. HULS 
South Dakota State College 
Brookings, South Dakota 


Statement of the Problem 


THE PROBLEM of state limitations on local public school expenditures in the United States arises from the fact that these limitations curb school facilities expansion and limit educational offerings. Therefore, it becomes necessary to determine which limitations are desirable and which are not, and to develop principles to guide the amending and forming of laws limiting these expenditures.

The major objectives of this study are (1) to establish the items of public school expenditures which should be limited by law, (2) to develop some principles which should be used to guide the amending of old and the formation of new laws which limit local public school expenditures, and (3) to apply these findings to current Minnesota laws limiting public school expenditures.


Method of Procedure 

For the purpose of developing principles, the literature on the subject of these limitations was reviewed. The following points were determined as representing the thinking in that literature:

1. General expenditure limitations were considered to be a function of the local government rather than of the state.
2. Tax limitations were opposed.
3. Debt limitations were favored.
4. Interest rate limitations were opposed.
5. Budgetary laws were considered desirable.
6. Statutory limitations were favored over constitutional limitations.


Next, the laws of the forty-eight states were assembled for the purpose of determining all the items limited by law. These were listed and included in a questionnaire along with principles whose origination is described below.

Principles concerning limitations on expenditures were developed based on two criteria for their formation: (1) actual practice in the majority of states, and (2) theory as represented in the literature, or both. These principles, and the items of limitation found in the state laws, were submitted for evaluation to two panels of experts. These two panels were composed of:


1. All the state departments of education of the 
forty-eight states. 

2. All those professors of educational administration who met at the convention of the National Conference of Professors of Educational Administration at the University of Connecticut between August 21 and 27, 1955.


In this questionnaire, the statement was made that each item listed should not be limited by law. Both the eight principles and these items of limitation so stated were evaluated by the panels by checking the categories "agree", "agree with reservations", and "disagree".

Eighty-seven percent (72 people) of the professors of educational administration and ninety-one percent (46 states) of the state departments of education completed the questionnaire. These responses were treated statistically to determine (1) that one of the three categories had a greater response than either of the other two categories for the item or principle under consideration, (2) whether this particular arrangement of answers in categories was different from chance or theoretical frequency at the 5 percent level of significance, and (3) that the largest category alone contributed essentially to the deviation from chance arrangement at the 5 percent level. Chance as referred to in number 2 above is interpreted as meaning the number of responses which would have occurred in any one category if the total number of responses had been divided evenly among the three categories. Table I shows, for principle one, that the total response was sixty-six and therefore the chance response for any one of the three categories was twenty-two.

220 JOURNAL OF EXPERIMENTAL EDUCATION

Since these data were categorical in nature, that is, enumerative, and could not be ranked or set in any order, the chi square test of significance was used. The formula for testing whether the arrangement of responses was significantly different from chance is:

$$\chi^2 = \sum \frac{(\text{observed frequency} - \text{theoretical frequency})^2}{\text{theoretical frequency}}$$

An illustration of how this formula was applied is shown in the analysis of the responses of the professors of educational administration to principle one (Table I).

The chi square table shows this figure of 40.727 with two degrees of freedom to occur with a probability of less than .001. Therefore, at the previously selected 5 percent level, the null hypothesis that the arrangement of responses on principle one came about by chance can be rejected. This same type of analysis was used throughout for the responses of the two panels of experts to both the principles and the items of limitation.
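The calculation behind the 40.727 figure can be reproduced in a short Python sketch. The function name `chi_square_equal_chance` is a hypothetical label; the response counts (46, 6, 14) are those listed for principle one in Table I, and the chance frequency is the total divided evenly over the three categories, as the text defines it.

```python
def chi_square_equal_chance(observed):
    """Chi square against an even split of the total across all categories."""
    expected = sum(observed) / len(observed)  # chance frequency per category
    return sum((fo - expected) ** 2 / expected for fo in observed)


# Agree, agree with reservations, disagree (professors, principle one).
responses = [46, 6, 14]
chi2 = chi_square_equal_chance(responses)
print(round(chi2, 3))
```

With three categories the test has two degrees of freedom, for which the tabled 5 percent critical value is 5.991.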

The next step in the statistical analysis of these data was the testing of whether or not the largest response category contributed essentially to this deviation from chance arrangement, also at the 5 percent level of significance. This could be solved directly from the 2 x 2 table where:

a = the largest response category
b = the theoretical or chance distribution of the largest response category
c = the total responses in the other two categories
d = the sum of the chance distributions of the other two categories

and the formula for direct solution of chi square is:

$$\chi^2 = \frac{(ad - bc)^2 (a + b + c + d)}{(a + b)(c + d)(a + c)(b + d)}$$


Since there is a total number (N) of responses, the theoretical distribution of each category is (N/3). Also, since there is the same total number (N), the two remaining observed frequencies can be represented by (N - the category to be tested) or (N - a). Therefore the following relationships exist:

a = a
b = N/3
c = N - a
d = 2N/3

Substituting these for a, b, c, and d:

$$\chi^2 = \frac{2N(3a - N)^2}{(3a + N)(5N - 3a)}$$

This is of value in facilitating calculations in that the fundamental numbers can be calculated easily and directly from but three expressions: 2N, 3a, and 5N.

The use of this formula for chi square is illustrated in Table II, using the responses to principle one by the professors of educational administration.

Since “a” represents the responses in the 
“agree” category, and the chi square 14.667 with 
one degree of freedom occurs by chance with a 
probability of less than 5 percent, it can be as- 
sumed that the “agree” category contributed most 
to the difference from chance shown in the first 
analysis. This analysis was used for the response 
to the principles and items of limitation as given 
by both the professors of educational administra- 
tion and the state departments of education. 
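As a check on the substitution, the derived shortcut can be compared numerically with the general 2 x 2 formula. This is a sketch under stated assumptions: both function names are hypothetical, and the inputs N = 66 and a = 44 are the values listed in Table II.

```python
def chi_square_2x2(a, b, c, d):
    """General chi square for a 2 x 2 table with cells a, b / c, d."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_largest_category(a, n):
    """Derived shortcut: 2N(3a - N)^2 / ((3a + N)(5N - 3a))."""
    return 2 * n * (3 * a - n) ** 2 / ((3 * a + n) * (5 * n - 3 * a))


n_total, a_largest = 66, 44  # values as listed in Table II
shortcut = chi_square_largest_category(a_largest, n_total)
general = chi_square_2x2(
    a_largest,                # a: largest response category
    n_total / 3,              # b: chance frequency of that category
    n_total - a_largest,      # c: responses in the other two categories
    2 * n_total / 3,          # d: chance frequency of the other two
)
print(round(shortcut, 3), round(general, 3))
```

Both routes give the same value, which is the point of the shortcut: only 2N, 3a, and 5N need to be worked out by hand.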

A further analysis was used to make a statistical cross comparison between the responses of the two panels of experts used. It was necessary to determine whether or not the two distributions of responses to the three categories were in proportion or disproportion. The chi square test was made of the statistical hypothesis that these two panels may be regarded as having come from the same population with respect to their judgment of the principle or item under analysis. It is helpful in accepting one of the response categories to be able to show that the distributions of responses of both panels on the item are in agreement with each other. If it were found that the chi square was not significant at the 5 percent level, then the statement could be made that the two distributions were not disproportionate.


The formula for this chi square with two de- 
grees of freedom is: 


š 1 
chi square = — (xap. n 
q r ( p) 


Illustrative analysis of principle one using this formula is shown in Table III.

Since the chi square of 3.184 is not significant at the 5 percent level, the hypothesis that these two panels are from the same population with respect to this item cannot be rejected.
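A comparison of two panels over three categories can be sketched as an ordinary contingency-table chi square. This uses the standard expected-frequency form, not the computing formula printed in the article, and the second row of counts is invented for illustration only; the function name `chi_square_contingency` is likewise hypothetical. The first row reuses the professors' counts from Table I.

```python
def chi_square_contingency(table):
    """Chi square and degrees of freedom for a rows x columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if both panels shared one distribution.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof


panels = [
    [46, 6, 14],  # professors: agree, agree with reservations, disagree (Table I)
    [30, 8, 8],   # state departments: HYPOTHETICAL counts, not from the article
]
chi2, dof = chi_square_contingency(panels)
same_population = chi2 < 5.991  # tabled 5 percent critical value for 2 df
print(round(chi2, 3), dof, same_population)
```

A value below the critical figure means the hypothesis of a common population is not rejected, which is the reading the article applies to its own 3.184.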


Method of Choosing Items from the Ratings of Experts

The following possibilities actually occurred and the acceptance of them was outlined as follows:

1. Accept the item as stated where
a. Both groups responded significantly in

HULS 


TABLE I

ANALYSIS OF PRINCIPLE ONE USING CHI SQUARE WITH TWO DEGREES OF FREEDOM

Responses                   fo    fo - ft           (fo - ft)²
Agree                       46    46 - 22 =  24     576
Agree with reservations      6     6 - 22 = -16     256
Disagree                    14    14 - 22 =  -8      64
Total Responses             66    Σ(fo - ft)² =     896
Chance Responses            22

Chi square = Σ(fo - ft)² / ft = 896 / 22 = 40.727


TABLE II

CALCULATION OF CHI SQUARE USING DERIVED FORMULA

Item            N     2N     5N     a     3a     3a-N    (3a-N)²    3a+N    5N-3a    Chi square
Principle One   66    132    330    44    132    66      4356       198     198      14.667

Chi square = 2N(3a - N)² / ((3a + N)(5N - 3a)) = 14.667
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the ‘‘agree’’ category, and were from the 
same population. 

b. One group responded significantly in the "agree" category and the other did not respond significantly in any category, and were from the same population.

2. Accept the converse of the item where
a. Both groups responded significantly in the "disagree" category and were from the same population.

3. Use writer's own judgment based on whatever other information is available
a. Where neither group responded significantly in any category.
b. Where both groups responded significantly in one category but were found to be from divergent populations.

4. Recognize the impossibility of analysis in this plan where
a. Both groups responded significantly, but one responded in the "agree" category and the other in the "disagree" category, whether from the same or from divergent populations.
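The acceptance rules above amount to a small decision procedure, sketched here in Python. The function name, argument names, and returned labels are all hypothetical; each panel is represented by the category in which it responded significantly, or `None` if it responded significantly in none. Combinations the list does not mention are reported as such rather than guessed at.

```python
from typing import Optional


def classify_item(panel_a: Optional[str], panel_b: Optional[str],
                  same_population: bool) -> str:
    """Apply rules 1-4 to one principle or item of limitation.

    panel_a, panel_b: 'agree', 'reservations', 'disagree', or None.
    same_population: True if the two-panel chi square was not significant.
    """
    categories = {panel_a, panel_b}
    if categories == {"agree", "disagree"}:
        return "cannot analyze"                      # rule 4a
    if panel_a is None and panel_b is None:
        return "writer's judgment"                   # rule 3a
    if panel_a is not None and panel_a == panel_b:
        if not same_population:
            return "writer's judgment"               # rule 3b
        if panel_a == "agree":
            return "accept item"                     # rule 1a
        if panel_a == "disagree":
            return "accept converse"                 # rule 2a
    if same_population and categories == {"agree", None}:
        return "accept item"                         # rule 1b
    return "not covered by the stated rules"


print(classify_item("agree", None, True))
```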


Applying these criteria for acceptance, all of the eight principles could be accepted with the exception of principles two, three, and four, where no significance was found in the responses at the five percent level. These three principles were restated using the writer's own judgment based on all available information.

Using the same criteria, all the limitations on expenditures could be analyzed with the exception of the limitation on expenditures for maintenance of the student away from home, school paid pensions, and salaries and expenses of officers of school building corporations.

The principles and items as finally stated under these conditions were:


1. Total debt should be limited, rather than warrants, bonds, etc., being limited separately.

2. The only limit on total expenditures should be a budget voted upon by the people or a representative body of the people elected for that purpose, with dedicated funds being allocated as dedicated.

3. Transfers within funds of the budget should only be limited for transfers out of dedicated or capital outlay moneys.

4. Transfers between funds of the budget should be limited by state law for transfers out of dedicated or capital funds while other transfers should be limited by board action only.

5. Interest should not be limited as to rate or
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amount, but should be arrived at through 
bids. 

6. Purchases above a set limit should be obtained by bids.

7. Any limitations on expenditures or borrow- 
ing should be statutory rather than constitu- 
tional. 

8. No exceptions should be made to the laws 
limiting expenditures or bonds to suit cer- 
tain districts, but if exceptions are neces- 
sary the entire law should be revised to fit 
all districts in the state. 


Limitations 


1. Salaries and expenses should be limited by 
law for: 
a. District school board members. 
b. County school board members. 
2. Salaries and expenses should not be limited 
by law for: 
a. Assistant county superintendent. 
b. County superintendent or supervisor. 
c. County truant officer. 
3. The following should be limited by law: 
a. Long term debt (bonds, etc. ). 
b. Short term debt (warrants, etc.). 
c. Total debt. 
4. The following should not be limited by law: 
a. Tax levies. 


b. Interest. 
5. The following general expenditures should not be limited by law:
a. Purchases in general.
b. Election costs.
c. Publishing.
d. Expenditures for real estate.
e. Membership dues to organizations.
f. Premiums on surety bonds.
g. Total expenditures.
h. Mileage.
i. Library board salaries.

6. Miscellaneous items: 
a. Tuitions should be limited to actual cost 
of education per pupil. 

(The following items were not included because of 
the diametrically opposite points of view of the 


two panels of experts): 
b. Maintenance of students away from home. 


c. School paid pensions. 
(The following item was not included because it 
was rated inconclusively by the panels and there 
was little evidence to guide intelligent decision 
on its part): 
d. Salaries and expenses of officers of 
school building corporations. 




Minnesota laws were found to differ from these findings in that only four groups out of the sixteen limitations existing in Minnesota were found to conform. They were:
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Salaries and expenses of board members. 
Debt limitations. 


Contracts and purchases in general. 
Tuition. 


Of these, only the tuition laws completely conformed to the applicable principles. In addition, Minnesota was found to lack any substantial budget law limiting expenditures.


Recommendations 


In general, all states should examine their 
laws in the light of these findings and apply these 
principles, repealing those laws which should not 
exist and modifying those which are not in con- 
formity with the principles. 

Minnesota, in particular, should repeal all but the four classes of laws which are acceptable in terms of the findings, modify all of the remaining limitations except the tuition laws, and add a budget law in keeping with the principles outlined here.
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AN EXPERIMENTAL EVALUATION OF 
A STUDENT-CENTERED METHOD AND A TEACHER-CENTERED METHOD OF BIOLOGICAL SCIENCE INSTRUCTION FOR GENERAL EDUCATION OF COLLEGE STUDENTS


KENNETH V. OLSON 
Northern State Teachers College 
Aberdeen, South Dakota 


WHICH OF two methods of teaching biological science to non-science majors preparing for ... will result in better performance on ... measures of abilities: (1) to recall and apply biological facts and principles, and (2) to use some of the inductive aspects of scientific thinking? This was the problem investigated in a recent quantitative study designed to shed further light on a few aspects of the perennial problem of teaching methods.1 *

The question of what objectives are worthy of the teacher's efforts was succinctly summarized in four pivotal ones including, according to Kenneth E. Anderson:2

1. Acquisition of factual information in science.
2. Understanding and application of the principles of science.
3. Understanding and application of the elements of the scientific method together with its associated attitudes, and
4. Skill in the basic tools peculiar to a specific science.

As far as the general education ... are concerned, all of these, ... Louis Heil ..., are based on the assumption that the understanding and use of scientific procedures in meeting needs in the basic aspects of living (also known as problem-solving) is the most significant contribution of science instruction. Many critics of education have expressed much concern for the ... of our young people in ... science. However, the future of ... achievements depends, to an unprecedented degree, upon a scientifically literate citizenry .... Such conditions justify ... investigation and, therefore, ... importance as one part in ... educational picture.

* Footnotes will be found at end of article.
From a critical analysis of both quantitative and qualitative methodological studies in the literature of general education (mainly science education) we may conclude, tentatively, that as far as

1. The acquisition of factual information or knowledge, the conventional teacher-controlled or "traditional" methods were superior;

2. Outcomes extending beyond those of factual 
information, such as the ability to apply sci- 
entific principles, to interpret data, and to 
draw conclusions from data, the evidence is 
not clear-cut as regards different instruc- 
tional procedures (probably, in the major- 
ity of the studies, there seemed to be evi- 
dence in support of those methods that tried 
to provide learning situations designed for 
the direct attainment of these purposes); 

3. The development of scientific attitudes, 
these were more likely to be attained through 
such methodologies as recognize and pro- 
vide for their direct development rather 
than as concomitant learnings. 


As a result of his experiences with schools, teachers, and teaching, plus a review of the literature in science education, the author (investigator) became concerned with the problem of how to improve his ability to present science in ways that not only meet the students' needs but also make it a lively and interesting subject.

Accordingly, the question of how to teach towards the attainment of such objectives was investigated in this experiment by comparing two teaching methods representing conflicting educational philosophies. These are often referred to as tra-
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ditional, authoritarian, or teacher-centered, and 
progressive, permissive, or student-centered. 
Various aspects of the teaching-learning situation were, therefore, operationally defined. The role of the instructor and of the students in the determination of objectives, the selection of content, the planning of class activities, the evaluation of learning, and other related aspects of the teaching methods were differentiated to provide a contrast in the two approaches. In general, the teacher-centered method was characterized as authoritarian; the teacher not only directed his own activities in preparing and presenting the subject matter, but also the students' activities in assigning the outside readings and other written work which he required of each individual. By contrast, the student-centered method was characterized as permissive; the teacher shared responsibilities with the class, and through cooperative teacher-student planning made provision for all aspects of the teaching-learning situation.

In the teacher-centered method, the classroom procedure consisted of lectures and demonstrations given entirely by the instructor. In the student-centered method, the students, following co-operative teacher-student planning sessions, developed the content of the course by units. They then selected topics or problems of personal interest to them and worked on these in small groups. Considerable use was made of audio-visual aids such as biological preparations, charts, models, films, etc.; these were supplied automatically to the teacher-centered groups, but were made available for use by small groups within the student-centered method groups, if they so desired. Only one textbook was used in the teacher-centered method groups while six basic textbooks (one-sixth of a section having any one textbook) plus a wide variety of reference materials were used by the student-centered method groups.


the students Selected th 
entist whose work t 


ng procedures? 
he draw? 


Were they justifiable? 


Other questions related to the scientific method of solving problems were also listed. Each small group evaluated their own report as well as being graded by the rest of the class.

In the Content groups (teacher-centered) each student was required to draw up a written report in which an example of an unscientific approach was contrasted with an example of a scientific method of approach to solving problems. Many references to examples of both types were given to the students and also a guide sheet on the Basic Assumptions of the Scientist, which, in essence, was an outline of scientific method and attitude.9 The students were to pick from their examples instances in which the scientific method(s) were or were not used and incorporate these into their written report. These were graded on an individual basis by the instructor.

The validity of any conclusions drawn from an experiment and the generalizability of inferences resulting from analysis of data demand that the questions to which we seek answers be framed in mathematical terms as the testing of statistical hypotheses. A modern self-contained experiment should utilize, as fully as possible, the principles of randomization, replication, and the use of local controls in its design.6 For these reasons it is important, in conducting such a study, that the novitiate avail himself of the counsel and guidance of an experienced research director, particularly in the early stages of planning the experiment. Further, the design and analysis of an experiment should be planned as complementary parts of the total investigation and appropriate for the data at hand. Otherwise we unnecessarily subject ourselves to repeating the errors of others and to conducting a so-called "post-mortem" analysis.

In order to gain familiarity and experience with the experimental or student-centered method of teaching and to focus on the details which differentiated it from the control or teacher-centered method, a pilot study was conducted with a 1959 Summer session class. This, along with the experience gained in teaching and evaluating previous years' classes, enabled the investigator to conduct the crucial experiment with the students who registered for the Winter and Spring quarters of the 1955-56 academic year at Northern State Teachers College. The population consisted of non-science majors attending the college over an undefined period of time. The samples of the population investigated were formed from the total number registering for the Winter and Spring quarters. These 105 students were randomly assigned to one of four sections at the beginning of the experiment by sequentially numbering the students 1, 2, 3, or 4 as they appeared to register and arbitrarily placing them in corresponding sections. Certain necessary exceptions were kept to a minimum. Replication was provided for by having two
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Sections taught by the control method and the oth- 
er two sections taught by the experimental method. 
The form of control was that of unrestricted ran- 
domization in which the same instructor attempt- 
ed to teach both methods with equal zeal to dupli- 
cate groups of similar subjects. 
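The sequential numbering scheme described above can be sketched in Python. This is an illustration under assumptions: the function name `assign_sections` is hypothetical, the "necessary exceptions" mentioned in the text are ignored, and nothing here says which two sections received which method, since the article does not state it at this point.

```python
from collections import Counter


def assign_sections(n_students: int, n_sections: int = 4):
    """Number students 1, 2, 3, 4, 1, 2, ... in order of registration."""
    return [(k % n_sections) + 1 for k in range(n_students)]


sections = assign_sections(105)  # 105 students, as reported
sizes = Counter(sections)
print(sorted(sizes.items()))
```

With 105 students the scheme cannot split evenly, so one section ends up a student larger than the other three.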

The author's Survey of Science course is a year-long sequence beginning in the Fall quarter with an introduction to the physical sciences which includes units in astronomy, physics and chemistry. During the Winter and Spring quarters selected areas of the biological sciences are taken up. Scope and sequence are indicated in the following outline of units.


Unit I: How do Scientists Solve Problems? (A unit on the scientific method of problem solving.)

Unit II: What are Some of the Basic Characteristics Common to all Living Things? (A unit on the cell concept, cell structure, protoplasm, theories as to the origin of life, characteristics of life, and photosynthesis.)

Unit III: What are the Form and Function of Our Bodies? (A unit dealing with the skeleton and muscles, nutrition and diet, digestion, circulation, respiration and excretion.)

Unit IV: What is the Biological Basis of Behavior? (A unit on the nervous and endocrine systems with emphasis on learning and mental health.)

Unit V: How is the Continuation of Life Forms Ensured? (A unit on plant, animal, and human reproduction and sex with emphasis on problems of adolescence, young adulthood, and family living.)

Unit VI: How are the Characteristics of Living Things Transmitted? (A unit on the ... basis of heredity including the concept of evolution of life forms.)

Unit VII: What Can We Do to Improve Personal and Public Health? (A unit in which modern ... and social health measures are studied.)

Unit VIII: How Can Man Help Preserve a Balance in Nature? (A unit on ecology and conservation with emphasis on local problems.)

Because of lack of time the final unit was omitted from the course.

Three evaluation instruments were used as criterion measures to
assess the difference between initial and final status of the
students on some of the objectives of instruction under the
contrasting treatments. These included tests for measuring (1) the
ability to recall and apply biological facts and principles, (2) the
ability to use some of the inductive aspects in scientific thinking,
(3) knowledge of vocabulary and comprehension of reading passages and
their interpretation in biology. The last mentioned test, given as an
outside criterion, was the Cooperative Biology Test.


The subject matter tests for evaluating students' achievement in
ability to recall and apply biological facts and principles for both
Winter and Spring quarters were devised by the author. These were
built as equivalent parallel forms from the responses of similar
students to previous administrations of the test items. Both sets
were built from items analyzed after the method of Davis. Reliability
coefficients were determined for each administration by either the
Maximum Likelihood estimate method for alternate forms or by the
split-half method with application of the Spearman-Brown Prophecy
Formula for single forms.8 The former method, as applied to the
Winter quarter (174B) pre-test results, gave reliability coefficients
ranging from 0.54 to 0.75. Using the latter method with alternate
forms given as post-tests resulted in reliability coefficients
ranging from 0.75 to 0.88. When the same form, originally given as a
pre-test, was administered as a retest two months after completion of
Winter quarter instruction, the reliability coefficients were found
to range between 0.66 and 0.88. The Spring quarter (174C) pre-test
results gave reliability coefficients between 0.59 and 0.74, while
those of alternate forms, given as a final, ranged between 0.68 and
0.86. Therefore these tests were found to possess a reasonably high
degree of internal consistency.
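
The split-half procedure with the Spearman-Brown correction mentioned above can be sketched as follows. This is a minimal illustration only: the odd-even split, the function names, and the small set of item responses are assumptions made for the example and are not taken from the study, whose item data are not reported here.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half, factor=2.0):
    """Step a part-test correlation up to a test `factor` times as long."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability, corrected to full test length.

    item_scores holds one row of 0/1 item scores per student.
    """
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd, even))


# Hypothetical responses of five students to six items (illustration only).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
reliability = split_half_reliability(responses)
print(round(reliability, 2))
```

The correction always raises a positive half-test correlation, since the full test is twice as long as either half.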

Various aspects of problem-solving abilities were measured by using
Mary A. H. Burmester's Ability to Think Scientifically Test IA.9 As
previously noted the Cooperative Biology Test, Forms X and Y, was
used as an outside criterion test.

Other tests used as sources of information on students' backgrounds
included the Otis Quick Scoring Self Administering Test of Mental
Ability, the Cooperative English Test Part C2, Reading Comprehension,
and the Cooperative General Achievement Test, Test III, Mathematics.

In the analysis of the data descriptive statistics were used,
initially, to present test results obtained under the contrasting
treatments (see Table I). The significance of the differences between
the means of pre-tests and post-tests, which were correlated
measures, was determined by using appropriate t-tests. After testing
the assumptions underlying the pooling of data, the results of the
replicated teacher-centered method group samples were combined, as
also were the results of the replicated student-centered method group
samples, in order to provide additional degrees of freedom for
estimating the error component. Statistical inferences were then
drawn on the basis of various tests of significance for several
specific null hypotheses accompanying three different statistical
analyses. These included: (1) a two-way analysis of variance (two
treatments by three levels of intelligence), (2) F and t-tests for
determining the significance of differences between the variances and
means of boys and girls in and between the contrasting treatments,
and (3) a one-way analysis of variance and covariance of final scores
holding pre-test and Otis intelligence test scores constant.

TABLE I

RANGE, MEAN, STANDARD DEVIATION AND MEAN GAIN SCORE ON INITIAL AND
FINAL ADMINISTRATION OF CRITERION TESTS ACHIEVED IN SAMPLES OF EACH
METHOD AND IN EACH METHOD'S SAMPLES COMBINED

Criterion    Sample               Range            Mean          Std. Deviation    Mean Gain
Test         Treat-            Pre-    Final    Pre-    Final    Pre-    Final     (Final minus
             ment(a)   N       test    test     test    test     test    test      Pre-test)

174B         2X        23      12-49   8-62     26.57   28.65    10.37   13.85(2)    2.08(3)
Wtr. Qtr.    6X        25      16-49   22-59    28.48   36.24     9.16   10.29       7.76
             Exp.      45(b)   12-49   8-62     27.36   32.62     9.96   12.85(1)    5.26
             Con.      51      11-54   19-69    27.02   36.04     9.60   13.31       9.02
             3C        30      3-47    9-69     23.17   31.53    10.56   14.02(?)    8.36
             5C        27      15-54   19-65    32.26   41.22     7.34   11.14(2)    8.96

174B         2X        18      8-62    14-58    30.06   32.56    14.72   12.99       2.50(3)
Retest       6X        23      22-59   12-62    37.04   34.78     9.79   14.38(1)   -2.26
             Exp.      41      8-62    12-62    33.98   33.80    12.53   13.66      -0.18
             Con.      46      19-69   6-64     37.48   36.76    13.01   11.76      -0.72
             3C        23      22-69   22-64    32.65   35.30    13.00    9.22(1)    2.65(3)
             5C        23      19-65   6-60     42.30   38.22    11.38   13.90      -4.08(3)

174C         2X        23      13-48   18-58    30.74   35.53    10.01   10.68       4.78
Spr. Qtr.    6X        24      19-56   23-68    33.29   44.08     8.90   12.59      10.79
             Exp.      46      13-56   18-68    31.91   39.72     9.51   12.43       7.81
             Con.      52      15-53   21-73    30.73   42.12     9.80   12.08      11.39
             3C        31      16-53   21-68    32.19   38.45    10.44   11.45       6.26
             5C        23      15-49   30-73    30.09   47.96     9.61   10.98      17.87

Ability to   2X        18      35-73   33-84    50.61   59.28    12.14   16.46       8.67
Th. Sci.     6X        23      30-80   40-91    52.83   61.30    13.63   14.28       8.47
             Exp.      41      30-80   33-91    51.85   60.41    12.88   15.11       8.56
             Con.      46      32-77   38-89    50.61   59.15    13.14   13.97       8.54
             3C        23      32-85   38-89    48.70   59.48    13.66   14.67      10.78
             5C        23      33-77   40-86    52.52   58.83    12.61   13.56       6.31

[test        ?         18      ?       5-40     37.56   25.11    14.16   11.55     -12.45
label        Exp.(?)   38      ?       7-49     36.76   26.33    11.68   13.37     -10.43
illegible]   ?         ?       ?       5-49     37.17   25.77    12.71   11.86     -11.36
             ?         ?       ?-66    8-56     34.11   27.00    12.10   10.18      -7.11
             ?         ?       15-61   13-45    36.48   28.48    10.22    8.26      -8.00

(a) Number indic[...] and letter X or C following designates these as
experimental (Exp.) [...].
(b) Number in combined group may differ from sum of method samples
due to dropouts or incom[...].
(1) Hypothesis of equal variances between pre-test and post-test is
in the region of doubt (.05>P[...]).
(2) Hypothesis of equal variances between pre-test and post-test is
rejected (.02>P>.01); in all other cases it is accepted.
(3) Hypothesis of equal means between pre-test and post-test is
accepted (P>.05); in all other cases it is rejected.

[...] Northern State Teachers College under either teaching method as
regards:

1. Mean initial and mean final performance.
2. Mean final performance of upper, middle, [...]ing no interaction
   between treatments and intelligence levels.
3. Mean initial and/or mean final performance of boys and girls.
4. Mean final performance [...].

The results of these analyses showed that:

For hypothesis one, generally, (1) mean subject-matter performance
was increased significantly in both treatments, (2) differential
retention of subject-matter did not [...] variances were not
homogeneous. The fulfillment of this assumption is, however,
essential to the [...].

For hypothesis four, (1) the [...] subject-matter performance of the
[...] method group students [...] method group students and [...]
performance of students in some [...] inductive aspects [...]
significant, [...] scientific thinking ability was [...] different
under either treatment.10 (See Table II).
A TABLE OF NORMAL DISTRIBUTION FREQUENCIES FOR SELECTED NUMBERS OF
CLASS INTERVALS AND SAMPLE SIZES


WILLIAM J. MOONAN* 
U. S. Naval Personnel Research Field Activity 
San Diego, California 


1. Introduction


THE PROBLEM of "fitting" a normal distribution to an observed
frequency distribution is often treated in statistical textbooks. A
related and easier problem is concerned with finding frequencies for
a normal distribution with a given mean, standard deviation and
number of class intervals. The latter problem is frequently
encountered by research workers in social science fields when basic
information is collected from subjects who are asked to sort a set of
words, phrases or sentences into categories. This procedure is
frequently referred to as a "forced-distribution" technique.

As an example, a researcher devises 40 items dealing with attitudes
toward a Naval career. He may wish a sample of subjects who are
either career or non-career prone to partition these 40 items into
five groups such that the frequencies in each group correspond as
nearly as possible to those derived from a normal probability
distribution function. In this way each item can be given a score
which is associated with the category to which it is assigned by the
subjects in each of the career groups. The ultimate object of the
analysis is to contrast the mean item score between groups in order
to determine items which discriminate effectively between the career
groups.

In order to find the appropriate frequencies for each of the five
categories it is necessary to specify the range of the scores to be
applied to the items and the widths of the class intervals. It is
common practice to make the scores run from one to a number
corresponding to the number of categories, in this case, five, and to
make all class intervals of equal size and equal to one. We shall
adhere to these conventions. With these specifications the range of
these scores will be five (i.e., 5.5-0.5), and because of symmetry,
the mean score will be necessarily three.

The choice of a standard deviation is not altogether arbitrary. For
instance, a σ = 10 would obviously be inappropriate for the problem
presented above. Certainly a σ = 1 would not seem unreasonable, nor
for that matter, would σ = 3/4. What σ should be chosen? There is an
objective way of determining the answer to this question. The purpose
of this note is to show interested readers how to find the
appropriate σ and to provide a table of frequencies for various
numbers of categories and sample sizes. In this discussion, sample
size refers not to the number of subjects used in the study, but
rather to the number of items to be assigned in all categories of the
distribution function, in other words, the total frequency.

* Opinions expressed are solely those of the author and are in no way
official; nor are they to be construed as representing those of the
Naval Personnel Research Field Activity or Bureau of Personnel.


2. Determining the Frequencies


It is generally known that the average range of scores of samples
taken from a normal distribution depends upon the sample size.
Various rules of thumb have been advanced to indicate how many σ's
are included within the range for various sample sizes. However,
Tippett (4) in 1925 derived the appropriate statistical answer and
his table of the ratio, Range/σ, for various sample sizes appears in
several places (3, 4, 5). Unfortunately, the existence of this table
and its use for the problem at hand are not widely known.

The table is entitled "Mean Range in Normal Samples of Size n." As
the title indicates, the table provides average values of ranges in
samples of size n from a normal distribution function with zero mean
and unit variance. By using
TABLE II

A TABLE OF NORMAL DISTRIBUTION FREQUENCIES FOR SELECTED NUMBERS OF
CLASS INTERVALS AND SAMPLE SIZES WITH SOME CORRESPONDING STATISTICS
TABLE I

COMPUTATIONS FOR DETERMINING NORMAL FREQUENCIES FOR A DISTRIBUTION
WITH A RANGE EQUAL TO 5 AND A SAMPLE SIZE OF 40

Class
Interval      Midpoint   Z(2)     Φ[Z(2)]   p(c)     np(c)     f
-∞ to 1.5     1          -∞                 .0968     3.872     4
1.5 to 2.5    2          -1.30    .0968     .2368     9.472     9
2.5 to 3.5    3          - .43    .3336     .3328    13.312    14
3.5 to 4.5    4            .43    .6664     .2368     9.472     9
4.5 to ∞      5           1.30    .9032     .0968     3.872     4


the following equation, we can determine the appropriate standard
deviation for our distribution by solving for σ.


(1)  $\text{Range} = W\sigma$


where "Range" is the range we specify for the distribution and W is
the table entry which, of course, depends upon n, the number of
items. The table given in (3) gives W's for n = 2(1)499 and for
n = 500(10)1000. Tippett's table is useful for the problem posed
since we can specify that the average range desired should be that
specified by the difference between the upper end of the largest
class interval and the lower end of the lowest class interval. In
such an event, the table can be immediately used to determine σ. For
the example considered above n = 40 and W = 4.32156 or 4.322, say.
Therefore σ = 1.157 since the range was 5. The probability of an
observation from a normal distribution, X, being less than or equal
to the cth class limit is


(2)  $P\{X \le Z(c) + \Delta Z/2\} = \Phi\left([Z(c) + \Delta Z/2 - \mu]/\sigma\right)$


and the frequency of observations belonging to the cth class interval
is


(3)  $n\,p(c) = n\,\Phi\left([Z(c) + \Delta Z/2 - \mu]/\sigma\right) - n\,\Phi\left([Z(c) - \Delta Z/2 - \mu]/\sigma\right) = n\,\Phi[Z(1)] - n\,\Phi[Z(2)]$


where

n is the size of the sample,

Z(c) is the midpoint of the cth class interval,

ΔZ is the width of the class interval,

μ is the mean of the population,

σ is the standard deviation of the population,

Φ is the standardized cumulative normal distribution function.
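
Equations (1) and (3) can be sketched in a few lines for the worked example (n = 40 items, five unit-width classes). The function names are assumptions made for this illustration; the value W = 4.322 is the table entry quoted in the text and is supplied by hand, not computed. Φ is evaluated from the error function rather than looked up in a printed table.

```python
import math


def phi(z):
    """Standardized cumulative normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_frequencies(n, k, w):
    """Return sigma and the theoretical frequencies n*p(c) for k classes.

    Classes have unit width and midpoints 1..k; w is the mean range
    for samples of size n, read from Tippett's table.
    """
    score_range = float(k)            # (k + 0.5) - 0.5
    sigma = score_range / w           # equation (1) solved for sigma
    mu = (k + 1) / 2.0                # mean score, by symmetry
    half_width = 0.5
    freqs = []
    for c in range(1, k + 1):
        # The first and last classes are open-ended, as in Table I.
        lower = -math.inf if c == 1 else (c - half_width - mu) / sigma
        upper = math.inf if c == k else (c + half_width - mu) / sigma
        freqs.append(n * (phi(upper) - phi(lower)))   # equation (3)
    return sigma, freqs


sigma, freqs = normal_frequencies(n=40, k=5, w=4.322)
print(round(sigma, 3))
print([round(f, 2) for f in freqs])
```

Because the sketch carries the standardized limits at full precision instead of rounding them to two decimals before entering a table, its decimals may differ slightly from those printed in Table I, though the frequencies round to the same integers.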


The numerical method of computing the final frequencies, f, will now
be described. First of all, the class intervals and their
corresponding mid-points are recorded in the first two columns of
Table I. Then the values of the mid-points, the half-width of the
class interval and the population mean are substituted in the
expressions for Z(2) as defined in equation (3). This is done for all
class intervals except the first (or smallest). Since the tails of
the normal distribution extend to -∞ and +∞ the first and last class
intervals are not finite. The computing procedure assumes that the
first midpoint is not 1, but -∞; and therefore Z(2) for this
particular interval is -∞. However, the computing procedure does
utilize the 5 for the highest mid-point and computes a finite Z(2).
In effect this amounts to computing a standardized variable for the
point at the upper end of the next to the largest class interval.


Having computed the Z(2)'s, refer these values [...] (in Hald's
notation) are [...] (2). Kelly's table was used to compute the [...]
for Table II. [...] the cumulative normal distribution function for
Z(2) [...]. The p(c)'s are then multiplied by n to give the
theoretical frequencies, n p(c). In [...] work the frequencies must
be
integers and therefore rounding the n p(c)'s is necessary. This
rounding may result in a total frequency 1 or 2 units different from
n. This means that the computer will have to exercise his discretion
in adjusting the frequencies so that the total sample size is n. This
problem occurs with the data given in Table I. After rounding, the
total frequency was 39 and this meant that a unit adjustment had to
be made. In order to maintain symmetry, the frequencies on the flanks
of the distribution must remain inviolate or be changed
symmetrically. It was decided to increase the central frequency by 1.
This yielded the frequencies listed in the f column. Rounding and
adjusting the frequencies means that the original σ used to derive
the frequencies will not be maintained. However, it will ordinarily
be only slightly affected. In this present case, the σ for the final
distribution is 1.118 whereas originally it was assumed to be 1.157.
The computation of the final frequencies can be shortened by taking
advantage of their symmetry. For a small number of class intervals
the saving is not too great, however. Besides, calculating all
frequencies serves as a check on the calculations since symmetry must
be obtained.
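
The rounding and adjustment step can be sketched as follows, starting from the n p(c) column of Table I. The rule of placing the whole discrepancy in the central class is an assumption of this illustration: it mirrors the choice described above for a one-unit discrepancy with an odd number of classes and is not offered as a general procedure. The function names are likewise hypothetical.

```python
import math


def round_and_adjust(theoretical, n):
    """Round n*p(c) to integers and restore the total to n symmetrically."""
    f = [math.floor(x + 0.5) for x in theoretical]   # round half up
    if len(f) % 2 == 0:
        raise ValueError("this sketch handles an odd number of classes only")
    f[len(f) // 2] += n - sum(f)                     # adjust the central class
    return f


def distribution_sigma(f):
    """Standard deviation of scores 1..k occurring with frequencies f."""
    n = sum(f)
    scores = range(1, len(f) + 1)
    mean = sum(s * c for s, c in zip(scores, f)) / n
    variance = sum(c * (s - mean) ** 2 for s, c in zip(scores, f)) / n
    return math.sqrt(variance)


table_i = [3.872, 9.472, 13.312, 9.472, 3.872]       # n p(c) column of Table I
f = round_and_adjust(table_i, 40)
print(f)                                             # [4, 9, 14, 9, 4]
print(round(distribution_sigma(f), 3))               # 1.118
```

The second printed value reproduces the σ of 1.118 quoted for the final distribution.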
Table II presents a set of frequencies for a wide variety of class
intervals and sample sizes. This information should be suitable for
most operational circumstances. Nevertheless, the general method for
determining the frequencies is given so that special cases can be
evaluated with ease.
THE USE OF PUPIL ACCOMPLICES TO INVESTIGATE TEACHER BEHAVIOR*


EVAN R. KEISLAR and JOHN D. McNEIL
University of California at Los Angeles


A TEACHER'S behavior often is a function of the way pupils respond.
If so, he will adopt certain ways of teaching and reject others
depending upon the pupils' reactions. In order to study this
important effect of pupil upon teacher, it is desirable to manipulate
the behavior of pupils in a predetermined fashion. This study
explored the possibility that such experimental conditions could be
established by using pupil accomplices. Apart from this
methodological inquiry, the major hypothesis of this study was that
teachers reliably differ to the extent to which they find enjoyment a
reinforcement as compared with pupil gain in achievement. A minor
hypothesis was that such a variable is significantly related to other
verbal behavior of the teacher; specifically, his score on a
dogmatism test, his expressed verbal attitudes toward teaching, and
his choice of phrases used in describing his pupils.


General Procedure


Each teacher was instructed to first use two methods in teaching
pupils individually and subsequently to select and use in continuing
instruction only one of the two methods deemed most appropriate.
Pupils were previously instructed to show greater enjoyment for one
method but more gain in achievement by the other. A balanced design
was used so that each teacher taught two boys and two girls, two of
whom showed enjoyment of the first method and two of whom showed
enjoyment of the second. In this way the variable under study is
independent of the method itself and the sex of the pupil.


Subjects


The subjects of the study were forty teachers in training enrolled in
a teaching methods class at the University of California, Los
Angeles. Each day for five days a different group of eight teachers
appeared for an eighty-minute session at the University Elementary
School ostensibly to "assist" in an experiment with teaching methods.


Instructions to Subjects


Upon their arrival at the school, these teachers were given a
twenty-minute briefing. Instructions were modified slightly on the
third day and again on the fourth so that the study essentially
involved three different experimental conditions as discussed later.
In general, the subjects were told that they were to assist in a
study of the "appropriateness" of two teaching methods, a visual and
a kinesthetic method, in relation to individual pupil differences.
Teachers were advised that during the next hour four pupils would
appear for tutoring, one at a time. Each pupil would bring with him
two packets of cards on which were written particular spelling words
selected especially for him.

The teachers were instructed to teach the pupil the ten spelling
words in the first packet by the visual method and the ten words in
the second by the kinesthetic. During the session the pupil would be
required to spell the words from the two packets. Those words
misspelled by the pupil were to be set aside and taught in a third
attempt using the method which appeared most "appropriate." After
each pupil was dismissed, the teacher was to record the method chosen
as appropriate for the child as well as his "general impressions" of
the pupil. All instructions in detailed form were both read and taken
for reference by the teachers who were sent to individual rooms where
they awaited the arrival of the pupils.


* Appreciation is expressed for the cooperation of the staff and
pupils of the University Elementary School, University of California,
Los Angeles, and to David Kagan who assisted with the study.
Pupil Accomplices and Their Instructions


Eight sixth grade pupils were the accomplices for the entire
experiment. Pupils were instructed to cooperate with the teacher in
all respects but to respond differently to the kinesthetic and visual
methods. Half of the pupils were told to show obvious pleasure and to
misspell certain words during the presentation of the lesson by the
visual method. They were to show no enjoyment of the kinesthetic
method but to misspell fewer words taught by this method. The other
half of the pupils were told to do the opposite, i.e., to show
enjoyment of the kinesthetic method but to misspell fewer words by
the visual. In a preliminary training session of approximately half
an hour duration, each pupil rehearsed his role and practiced the
exact words he was to misspell. Each pupil was assigned daily to four
teachers. After each teaching situation, pupils reported to the
experimenters the method selected by the teacher.


Experimental Conditions


As implied earlier, three experimental conditions were used during
the study. In Condition I, in effect for the first two days, pupil
accomplices misspelled three more words by one method than the other,
and the experimenters gave no suggestion to the teachers regarding
the basis for selecting the "appropriate" method. As indicated in
Table I, the vast majority of the teachers selected the method which
resulted in greater achievement gain. Therefore, on the third day
Condition II was established, wherein pupil accomplices misspelled
only two more words by one method and the wording of the instructions
was changed slightly to make the criterion of spelling achievement
less obvious.

Inasmuch as the results under Condition II still indicated a skewed
distribution, for the last two days an additional change was made in
the instructions. Here, under Condition III, the teachers were
instructed to judge the "appropriateness" of the method in terms of
both "long and short term consequences" and what "would be best for
the pupil in view of his emotional responses, achievement, mental
stumbling blocks, physical behavior and the like."


Results


As indicated in Table I, the variability among the teachers under
Conditions I and II was not large enough to justify a statistical
test of the reliability of this measure. However, under Condition
III, [...] of the sixty-four choices were made in the direction of
student preference, warranting a test of significance of teacher
consistency. The scores obtained from the first and third choices of
the sixteen teachers in Condition III were compared with their scores
from the second and fourth. Twelve of the sixteen cases were found to
be above or below the median on both pairs. Using the one-tailed sign
test, the null hypothesis is rejected at the .05 level. This permits
the conclusion that under Condition III of this study, teachers
reliably differ in the extent to which they find pupil enjoyment as
compared with pupil gain in achievement the more important
reinforcement.
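
The one-tailed sign test reported above can be checked with a short computation. The sketch assumes, as the sign test does, that under the null hypothesis each of the sixteen teachers is equally likely to fall on the same side or on opposite sides of the median on the two pairs of choices; the function name is an assumption made for the illustration.

```python
from math import comb


def sign_test_one_tailed(successes, n):
    """P(X >= successes) when X is binomial with n trials and p = 1/2."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


# Twelve of the sixteen teachers were consistent across the two pairs.
p = sign_test_one_tailed(12, 16)
print(round(p, 4))   # 0.0384
```

The probability of twelve or more consistent cases out of sixteen is 2517/65536, which falls below .05 and agrees with the rejection reported in the text.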

In order to discover the correlates of this measure, those teachers
in each of the three experimental groups who were above their
respective group medians were compared on several bases with those
below these group medians. No differences were found between the high
and low groups on the Rokeach Scale of Dogmatism.** Further, no
differences were found with respect to the frequency with which the
groups used task-centered or emotional phrases in their descriptions
of pupils, these phrases having been so classified by four judges.
The two groups of teachers were also compared with respect to their
answers on fifteen two-choice items dealing with attitudes toward
teaching. Only one difference significant at the required .001 level
was found: When asked to choose the better solution to parent
criticism of the school, teachers who were more influenced by student
preferences checked cooperative planning and improvement of
interpersonal relations, while those teachers who were influenced by
pupil gain in spelling checked independent study and decision-making
by school authorities.

Questionnaires administered to the teachers at the conclusion of the
study revealed that only one teacher of the forty entertained the
possibility of coached behavior on the part of the pupils.
Practically all of the teachers regarded their participation a
realistic and valuable experience. Most of the successful control of
this teaching environment was attributable to the role-playing
ability of the pupil accomplices. These pupils played their parts
with zest every day, their enthusiasm on the fifth day being as high
as on the first even though they were each completing twenty such
teaching sessions. Agreement of reports from pupils and teachers
concerning the method selected was almost perfect, pupils and
teachers agreeing in all but one of the one hundred and sixty
independently made reports.


Conclusions


In the teaching of spelling where no criteria for selection of method
are suggested, most teachers are influenced more by pupil spelling
gain than by pupil preference. This is probably because spelling to
most teachers has always been associated with the criterion of
achievement.

On the other hand, where multiple criteria for selection of method
are mentioned, teachers differ reliably in the extent to which they
are influenced by pupil enjoyment as contrasted with pupil gain in
achievement. While it is tempting to speak of these two groups of
teachers as being either "child-centered" or "subject-centered," no
such interpretation has been drawn since these terms themselves are
in need of clear definition. There were many teachers who yielded to
pupil preference in the experiment, yet in their comments gave the
clear impression of placing more importance upon subject mastery.

It would appear that pupil accomplices can be effectively used in
establishing experimental conditions for the study of teacher
behavior. The technique has merit as a research tool, making possible
realistic but controlled classroom conditions.


** Rokeach, Milton, "Political and Religious Dogmatism: An
Alternative to the Authoritarian Personality," Psychological
Monographs: General and Applied, Vol. 70, No. 18, 1956, pp. 1-43.
A CONTROL CHART FOR ERRORS IN IBM TEST SCORING MACHINES*


HARVEY F. DINGMAN, WILLIAM G. HOYT,** KENNETH F. THOMSON
Personnel Research Branch, TAGO
Department of the Army


* The opinions or conclusions contained in this report are those of
the authors, and are not to be construed as reflecting the views or
indorsement of the Department of the Army.

** H. F. Dingman now at Pacific State Hospital, Pomona, California;
W. G. Hoyt now at Systems Development Corporation, Santa Monica,
California.


IN SCORING test papers with the IBM Test Scoring Machine, it is well
known that variations in humidity, voltage, and age and condition of
answer sheets affect the scoring response of the machine. Because of
variability in the scores some form of checking procedure is followed
by most users of the scoring machine. The checking usually consists
of a complete rescoring of papers or spot checks. The extensiveness
of spot checking depends in part upon local tradition and in part
upon the kind of precision required for the job at hand. Checking of
scores costs money or time as does any other quality control
procedure. Further, neither local tradition nor an arbitrary "kind"
of precision is enough to provide information sufficient to achieve
economical scoring checks.

Since checking answer sheet scoring is exactly analogous to
inspecting industrial production it was decided to investigate one of
the statistical quality control devices used in industry. Kimball (1)
has proposed a number of sampling plans for analysis of errors in
scoring IBM answer sheets. However, none of the plans in Kimball's
paper are presented on the basis of practical experience. It is
proposed in this paper to present checking procedures based upon
regions that seem reasonable empirically.

In a study of the comparability of hand scoring of IBM answer sheets,
the agreement between five machine scorers was tabulated. (See top of
next page.) Thus, while there is far from perfect agreement among the
scorers, there is very good agreement if the criterion of ± 1 point
is used to define agreement. As a matter of fact, the mean number of
agreements ± 1 point is 96.0 and the Standard Deviation is 1.95
agreements. Since this scoring was done from a random group of papers
and the scoring was done carefully, 96 percent agreement (agreement
between two scores within ± 1 point) seems to [...]able degree of
agreement between [...] seems an acceptable standard of agreement to
use to require as a criterion of acceptability for a set of scores
derived from rescoring of IBM answer sheets. In different situations,
different standards will be necessary. In very short tests where
important decisions are to be based on scores within two or three
points of each other, obviously more rigorous standards must be used.
In the situation where there are large samples, long tests, and
procedures are somewhat flexible, 96 percent agreement seems
adequate. For practical considerations, one would not care to use a
scoring that agreed less than 90 percent with another scoring.
Actually, 2-1/2 S.D. down from 96 percent agreement is 91 percent
agreement. This seems to be an acceptable bound for rejecting
scorings. If a group of scorings of papers from a large sample showed
agreement between 91 percent and 96 percent between two scorings of
the same paper, more papers would be examined before accepting or
rejecting the scorings of the larger sample.

If we consider each test paper as a random sample from a population
of papers, and we rescore the paper to see if the original score on
the paper falls in the limits of error (± 1 score point), then we
make a yes-no decision whether to accept the score on the paper as
correct or not. For a group of such papers we could use the binomial
distribution to test the hypothesis that the population from which
the group came possessed the quality, "96 percent of all the papers
if rescored would fall in the limit of error (± 1 score point)."

The sequential probability ratio test for the mean of a binomial
distribution as presented in Wald (2) seems to provide a solution to
the problem. If we let P0 be the lower bound of the proportion of
defectives in an acceptable group, and P1
FIGURE I

CONTROL CHART FOR ERRORS IN SCORING

(Vertical axis: Errors in Scoring. Horizontal axis: Papers Inspected,
in Batches of Five. Legend: X = Rejected Sample; o = Accepted
Sample.)
AMOUNT OF AGREEMENT AMONG FIVE MACHINE SCORERS


Number of Perfect Agreements Out of 100 Papers*

              Scorer
Scorer     1     2     3     4
  1
  2       55
  3       52    70
  4       83    55    57
  5       55    15    15    59


*Test is a 60 item test for which the scoring formula is R - 1/3W.


Number of Times Scorers Agreed Within ± 1 Point on 100 Papers*

              Scorer
Scorer     1     2     3     4
  1
  2       95
  3       94    95
  4       99    98    93
  5       97    98    94    97


The 100 papers were selected as a random group of papers that
routinely come in from the field.


the upper bound, then from the preceding paragraphs $P_0 = .04$
and $P_1 = .09$. The true proportion defective, $P$, will vary
considerably. For economic reasons such as keeping the rescoring
to a minimum, the limits of confidence for errors of the first
($\alpha$) and second ($\beta$) kind should be placed at the five
percent level. There seems to be no reason for setting one
confidence limit ($\alpha$ or $\beta$) higher than the other, as
one should be equally willing to reject papers (as poorly scored)
as willing to accept the scorings. This is not always true,
however; in certain circumstances, such as preparing [word
illegible in source] for an important test, it would be better to
err in the direction of failing to accept the scoring of a sample
of papers.
Wald (2), in a discussion of the formulae for testing for the mean
of a binomial distribution, presents graphical procedures for
determining the acceptability of a sample. In Figure I the
cumulative number of disagreements between two scorings is plotted
for each sample. If the line of plots of cumulative disagreements
in scoring crosses the rejection line, the sample of papers should
be rejected as poorly scored. If the line of plots of the number of
disagreements crosses the acceptance line, we accept the sample as
correctly scored. If the line of plots does not cross the
acceptance line or the rejection line, one must keep on testing and
plotting until all the papers in the sample are checked.

If the Average Sample Number function of the procedure is computed,
it will be seen that if a large sample of papers has 96 percent of
its papers with correct scores, on the average only 140 papers will
need to be re-scored in order to accept the sample as correctly
scored.
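
The control chart described above can be sketched in a few lines. The following is a minimal illustration, not the authors' own computation: it assumes the standard Wald formulas for the two parallel decision lines of a binomial sequential test and for the approximate Average Sample Number when the true proportion equals $P_0$. The function names are hypothetical, and the default values are the ones quoted in the text ($P_0 = .04$, $P_1 = .09$, $\alpha = \beta = .05$).

```python
import math


def sprt_lines(p0, p1, alpha, beta):
    """Return (slope, h_accept, h_reject) for Wald's binomial SPRT chart.

    Acceptance line: d = slope * n - h_accept
    Rejection line:  d = slope * n + h_reject
    where n is papers checked and d is cumulative disagreements.
    """
    g = math.log((p1 * (1 - p0)) / (p0 * (1 - p1)))
    slope = math.log((1 - p0) / (1 - p1)) / g
    h_accept = math.log((1 - alpha) / beta) / g
    h_reject = math.log((1 - beta) / alpha) / g
    return slope, h_accept, h_reject


def sprt_decision(n_checked, n_disagree, p0=0.04, p1=0.09,
                  alpha=0.05, beta=0.05):
    """Classify one plotted point as 'accept', 'reject', or 'continue'."""
    slope, h_accept, h_reject = sprt_lines(p0, p1, alpha, beta)
    if n_disagree >= slope * n_checked + h_reject:
        return "reject"
    if n_disagree <= slope * n_checked - h_accept:
        return "accept"
    return "continue"


def asn_at_p0(p0=0.04, p1=0.09, alpha=0.05, beta=0.05):
    """Wald's approximate Average Sample Number when the true
    proportion of disagreements equals p0 (acceptance prob. 1 - alpha)."""
    log_a = math.log((1 - beta) / alpha)      # upper (reject) threshold
    log_b = math.log(beta / (1 - alpha))      # lower (accept) threshold
    mean_step = (p0 * math.log(p1 / p0)
                 + (1 - p0) * math.log((1 - p1) / (1 - p0)))
    return ((1 - alpha) * log_b + alpha * log_a) / mean_step


if __name__ == "__main__":
    print(sprt_lines(0.04, 0.09, 0.05, 0.05))
    print(sprt_decision(60, 0), sprt_decision(20, 2), sprt_decision(20, 5))
    print(round(asn_at_p0()))
```

Under these assumptions the Average Sample Number at $P_0 = .04$ comes out close to the figure of 140 papers quoted in the text.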


Summary


A procedure has been presented for providing 
economical checks of the accuracy of IBM answer 
sheet scoring. This procedure was devised to 
routinely check samples from large groups of 
fairly long tests that had been administered and 
scored in field units. The procedure consists in 
constructing and maintaining a control chart for 
all errors in scoring greater than one point. If 
the sample indicates that the percent of acceptable 
papers in the group is 96 or greater, then the 
group is declared acceptable. 

Other situations that require more (or less) 
rigorous standards can easily construct their own 
control charts by consulting Wald (2). 
A SUPPLEMENTARY NOTE ON "MAIN EFFECTS
AND NON-ZERO INTERACTIONS
IN A TWO-WAY CLASSIFICATION"


RAYMOND O. COLLIER, Jr. 
University of Minnesota 


THE AUTHOR wishes to thank Joe H. Ward and Robert A. Bottenberg
for pointing up an error in his paper [1] and also for motivating
him to look more basically at the problem with which that paper
was concerned. It was maintained there in equation (15) that,
under the hypothesis $H_0: \alpha_i = 0$, with side conditions
$\sum_i \gamma_{ij} = \sum_j \gamma_{ij} = 0$, the residual sum of
squares was

$$SS'(E) = SS(E). \tag{1}$$

Upon re-examining the minimization process it was found that the
correct result should have been

$$SS'(E) = SS(E) + SS(\alpha). \tag{2}$$

Thus the test of $H_0: \alpha_i = 0$, as referred to in [1], with
the above side conditions does proceed as usual and is made by
means of

$$F = \frac{SS(\alpha)/(I-1)}{SS(E)/IJ(K-1)}. \tag{3}$$


The problem can be viewed more basically, however, and the
question asked, "Referring to [1], exactly what is being tested in
$H_0: \alpha_i = 0$?" [The sentence introducing the answer is
illegible in the source.]

Consider the following model:

$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}, \tag{4}$$

where $\mu$, $\alpha_i$, $\beta_j$, $\gamma_{ij}$ are fixed
effects, the $\epsilon_{ijk}$ are random effects as defined in [1],
and $i = 1, 2, \ldots, I$; $j = 1, 2, \ldots, J$;
$k = 1, 2, \ldots, K$. The expectation of $X_{ijk}$ is given by

$$E(X_{ijk}) = \mu + \alpha_i + \beta_j + \gamma_{ij} = \xi_{ij}.$$


Now (4) can be written in matrix form as $X = Y\beta + E$, where
$X$ is $(IJK) \times 1$, $Y$ is $(IJK) \times (IJ+I+J+1)$, $\beta$
is $(IJ+I+J+1) \times 1$, and $E$ is $(IJK) \times 1$. It can be
shown that the rank of $Y$ is $IJ$, which implies that just $IJ$
linear functions of the parameters $\mu$, $\alpha_i$, $\beta_j$,
$\gamma_{ij}$ in $\beta$ are estimable. (A linear function of
parameters is said to be estimable if it is possible to construct a
linear function of the observations, an estimate, which is
unbiased.)
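
The rank claim can be checked numerically for a small layout. The sketch below is only an illustration under stated assumptions: the column ordering ($\mu$, then the $\alpha_i$, the $\beta_j$, and the $\gamma_{ij}$) and the function names are choices made here, not taken from the paper, and the rank is found by exact Gaussian elimination over fractions.

```python
from fractions import Fraction
from itertools import product


def design_matrix(I, J, K):
    """Build the 0/1 design matrix of model (4).

    One row per observation (i, j, k); columns are ordered as
    mu, alpha_1..alpha_I, beta_1..beta_J, gamma_11..gamma_IJ.
    """
    rows = []
    for i, j, _k in product(range(I), range(J), range(K)):
        row = [0] * (1 + I + J + I * J)
        row[0] = 1                          # mu
        row[1 + i] = 1                      # alpha_i
        row[1 + I + j] = 1                  # beta_j
        row[1 + I + J + i * J + j] = 1      # gamma_ij
        rows.append(row)
    return rows


def matrix_rank(matrix):
    """Exact rank by row reduction using rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


if __name__ == "__main__":
    I, J, K = 3, 2, 2
    Y = design_matrix(I, J, K)
    print(len(Y), len(Y[0]), matrix_rank(Y))
```

For $I = 3$, $J = 2$, $K = 2$ the matrix has $IJ + I + J + 1 = 12$ columns, but its rank equals $IJ = 6$, the number of cells.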


Three theorems regarding estimability are provided by Kempthorne
[2:78] which will aid in this problem. They are (a) any linear
function of the parameters in a linear model is estimable if it is
a linear function of the expectations of the observations; (b) any
reparametrization leads to the same estimate of an estimable
function; and (c) it is possible to test hypotheses only about
estimable functions.

Utilizing (a), it is obvious that $\xi_{ij}$ is estimable and
therefore any linear function of the $\xi_{ij}$ is also estimable.
However, no linear function of the $\alpha_i$ in (4) is estimable
unless restrictions are imposed on the parameters, i.e., they are
reparametrized. It is clear, then, from (c) that the hypothesis
$H_0: \alpha_i = \alpha$ in (4) cannot be tested.

If one labels the reparametrized $\alpha_i$ of [1] as
$\alpha_i^*$ and notes that the estimate of $\alpha_i^*$ was given
as $\hat{\alpha}_i^* = X_{i..}/JK - X_{...}/IJK$, it is possible to
invoke (b) above and ascertain just what is being estimated by
$\hat{\alpha}_i^*$. It is seen that the expected value of
$\hat{\alpha}_i^*$ is
$\alpha_i + \gamma_{i.}/J - \alpha_{.}/I - \gamma_{..}/IJ$ and
that, moreover, the hypothesis $H_0: \alpha_i = 0$ from [1] is in
reality the hypothesis $H_0: \alpha_i^* = 0$, equivalent to
$H_0: \alpha_i + \gamma_{i.}/J = \alpha_{.}/I + \gamma_{..}/IJ$.

This last hypothesis is exactly the hypothesis
$H_0: \xi_{i.}/J = \bar{\xi}$, where $\bar{\xi} = \xi_{..}/IJ$, so
that we have

$$H_0: \xi_{i.}/J = \xi_{..}/IJ.$$
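
A small numerical check may make the equivalence concrete. The sketch below uses hypothetical parameter values (chosen here, and deliberately not satisfying any side conditions) to confirm that the expected value of the estimate, computed from the cell expectations $\xi_{ij}$, equals $\alpha_i + \gamma_{i.}/J - \alpha_{.}/I - \gamma_{..}/IJ$. The variable and function names are illustrative only.

```python
from fractions import Fraction as F

# Hypothetical parameter values for a 3 x 2 layout (illustrative only).
I, J = 3, 2
mu = F(10)
alpha = [F(1), F(-2), F(4)]
beta = [F(3), F(-1)]
gamma = [[F(2), F(0)], [F(-1), F(5)], [F(1), F(1)]]

# Cell expectations xi_ij = mu + alpha_i + beta_j + gamma_ij.
xi = [[mu + alpha[i] + beta[j] + gamma[i][j] for j in range(J)]
      for i in range(I)]


def expected_estimate(i):
    """E(X_i../JK - X_.../IJK): row mean minus grand mean of the xi_ij.

    The number of replications K cancels, so it does not appear.
    """
    row_mean = sum(xi[i]) / J
    grand_mean = sum(sum(row) for row in xi) / (I * J)
    return row_mean - grand_mean


def closed_form(i):
    """alpha_i + gamma_i./J - alpha_./I - gamma_../IJ."""
    gamma_total = sum(sum(row) for row in gamma)
    return (alpha[i] + sum(gamma[i]) / J
            - sum(alpha) / I - gamma_total / (I * J))


if __name__ == "__main__":
    for i in range(I):
        print(i + 1, expected_estimate(i), closed_form(i))
```

The two columns printed agree for every row, and the values depend on the $\gamma_{ij}$ as well as on the $\alpha_i$, which is the point of the note.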


Summary

The point to the above remarks is as follows: In a two-way
classification, assuming the interaction model of (4), a test of
the main effect hypothesis $H_0: \alpha_i = \alpha$ is impossible.
That which can be tested is $H_0: \xi_{i.}/J = \xi_{..}/IJ$, which
is identical to testing
$H_0: \alpha_i + \gamma_{i.}/J = \text{a constant} = \alpha_{.}/I + \gamma_{..}/IJ$
and is made by means of (3) as usual. The test of the hypothesis
in the reparametrized model is expressly a test of the two
immediately preceding hypotheses.

REFERENCES

1. Collier, R. O. "Main Effects and Non-Zero Interactions in a
   Two-Way Classification," [...] Education, [remainder of
   citation illegible in source].

2. Kempthorne, O. Design and Analysis of Experiments (New York:
   John Wiley, 1952).
TEST NORMS AND SAMPLING THEORY


FREDERIC M. LORD* 
Educational Testing Service 


Princeton, 


Summary 


TEST NORMS usually constitute a set of sample statistics
(percentiles or percentile ranks) enabling the test consumer to
make inferences regarding a norms population from which the norms
sample was drawn. In such cases, it is important that the
selection of the norms sample and the subsequent statistical
treatment shall be such as to minimize the inevitable sampling
errors in the published norms table.
At least in the case of tests used in our s[chools and col]leges,
the problems of optimal sampling [procedures] and optimal
statistical procedures are considerably complicated by the fact
that, whereas the final norms table is most [...] as representing a
group of [students, the] sampling unit used in drawing the norms
[sample] is [...] the school. Since schools us[ually differ] from
each other in mean score, sampling errors in the final norms table
will ordinarily be large unless the number of schools in the norms
sample is large. The number of students in the norms sample
typically has only a [...] and indirect relation to the size of the
sampling [errors] in the norms table.

In the most common method of drawing a norms sample, all students
in each school selected are tested and included in the norms,
provided they are at the proper grade level or levels. This
constitutes simple cluster sampling. In such sampling, [...] ally
[...] [bias]ed estimate of the m[ean] [...]. An unbiased estimate
[...] from the sample data (eq. 3) [...] [er]ror (eq. 4) is likely
to be so [...] to th[at] of the mean of the nor[ms] [...] as to
preclude the use of the unbiased estimate in most situations.

Numerical examples for two typical situations show that under
simple cluster sampling, twelve to thirty times as many students
[must be] tested in order to obtain the same accuracy in the norms
table that would be obtained under simple random sampling of
students. Un[...]


* 
т aces е 
е writer is indebted to Professor Francis An 


Copyright 1959 by 


scombe for helpful 


New Jersey 


simple random sampling of students, without regard to their
"clustering" in schools, is ordinarily impractical. The best
practical procedure to avoid the excessive sampling errors
characteristic of cluster sampling is to use two-stage sampling: a
sample of schools is drawn in the usual fashion, and then a random
sample of students in each school is selected for testing.

At the lower educational levels, two-stage sampling may often be
feasible only when two or more tests can be normed simultaneously,
in which case only a fraction of the students in each classroom
need take any one of the tests. At the college level, it is usually
impossible to arrange for the normative testing of all students at
a given grade level in a variety of institutions. In this
situation, the common procedure is to test as many students as
possible in a relatively small number of colleges.
Not only does this increase the sampling errors in the published
norms table unnecessarily; it also [prevents the] sample of
students from being a random [one], since any sizable group of
students [tested at a giv]en time of day ordinarily excludes those
with conflicting scheduled activities. In order to obtain a truly
representative norms [sample fo]r colleges, it will ordinarily be
necessary to [test s]o few students in each college that [a time
c]an be found when all students chosen [at an ins]titution can be
scheduled for a single [...].

two-stage sam- 
the question of 
obe used for 
ulation 


ways of drawing 


the choic 
estimating th 


Some nume rical exam- 
ingthe economies 
ge sam pling instead of 
, andin Table III illustrat- 
of various two-s tage 


methods. 
ethod is given 
ulas can be use 


by means of which 
the same form d to provide stand- 


comments on 4 draft of this paper. 


Dembar Publications, Inc 
for each percentile rank in the norms table as for the mean of the
norms distribution. A numerical example in Table II indicates that
the advantage of two-stage sampling over cluster sampling holds for
estimating percentile ranks as well as for estimating the
arithmetic mean.

Methods of obtaining school-mean norms from two-stage sampling
data are briefly discussed.

It will ordinarily be important to stratify the norms population
on certain school characteristics related to test score, such as
geographical region and type of support (public,


representing the achievement 
of answering а Certain number of these items cor- 
rectly. More commonly, however, t 


Population is small 
Seniors gr adu- 


interpretation, In this sj 


discussed here, 


: ampl i 
ods will lead to norms that are both атеш 
nomical and mor, 


[...] those in current use.

Most nationally distributed tests are [provided] with some sort of
"national" norms. A [nation]wide norms population has certain
[advantages]: (i) the population can be readily and clearly defined
for the test-score consumer; (ii) most [students] tested are likely
to be members of this group, or of some presumably very similar
group following a year or two later; (iii) such a group is roughly
equally appropriate for publishers in various parts of the
country, thus in theory making test norms comparable from
publisher to publisher.

One main trouble with using a nationwide [group] as a norms
population is that in most cases [it is] utterly impossible at
present to test a [representa]tive sample of such a group. Many
schools, for example, are simply unwilling to test. In [this] case,
the norms population can, at best, [be] described as consisting of
"schools that are willing to test." This description, moreover, is
[an] inadequate one, since the willingness of a school to test
varies widely depending on the demands made and on the inducements
offered.

The disadvantages of a "national" norms population lead many
thoughtful persons to urge that the use of such populations should
be aban[doned]. It seems likely to the writer, however, that the
future will bring increasingly effective [efforts to] obtain
representative national norms, at least for a few of the more
important tests.

The present [article deals] explicitly with [the] sampling problems
that arise when an [attempt is] made to obtain aptitude or
achievement test [norms] representative of students in those
schools [through]out the nation that are "willing to test." [...]
the principles to be discussed will have obvious applications in
obtaining norms for other populations.

Sampling Fluctuations in a Simple Case


Before discussing the sampling problems that arise in obtaining
"national" norms, it will be helpful to consider a simpler
situation. Suppose that in a large university it is decided to
administer a certain test to a random sample of freshmen; and let
it be assumed, for convenience, that the sample is small enough and
the total number of freshmen is large enough so that the usual
formulas for sampling from an infinite population may be applied.

The "norms" thus obtained for the [freshmen] at this university
will consist of a table [...] showing the percentile (i.e., the
[...]) [...]. Each number in the norms table has a standard error



that indicates the amount of this sampling fluctuation. Two kinds
of standard errors may be [distinguished, corresponding] to the
different ways of organizing [the norms table].

[Suppose,] for example, [that if all] freshmen are tested, 50
percent will obtain a raw score below 20. If there are M = 100
freshmen in the sample tested, the usual binomial formula gives the
standard error of the percentage as $\sqrt{(.5)(.5)/100} = .05$.
This standard error indicates that if many different random
samples of 100 freshmen were to be obtained at this university,
about two-thirds of the resulting norms tables would assign a
percentile rank between 45 and 55 to a test score of 20, and about
95 percent of these norms tables would assign a percentile rank
between 40 and 60 to this score.

The other type of standard error is expressed in the units of the
score scale. Although these units can certainly not be considered
as strictly equal, they [are] considered to be more [nearly equal]
than are the units between successive percentile ranks. To obtain
an example of this second kind of standard error, suppose that the
standard deviation of the scores of all freshmen would be
$S_X = 10$ if all were tested. In samples of M = 100 from such a
population, the sample mean will have a standard error of
$S.E._{\bar{x}} = S_X/\sqrt{M} = 1$. From this it is concluded
that the sample mean will fall within one score point of the
population mean in two-thirds of the samples, and within two score
points of the population mean in 95 percent of the samples.
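
The two standard errors in this example can be reproduced directly. This is a minimal sketch with hypothetical function names; it applies only the binomial formula and the $S_X/\sqrt{M}$ formula quoted in the text, and the one- and two-standard-error bands use the usual normal-curve reading of "about two-thirds" and "about 95 percent".

```python
import math


def se_percentile_rank(p, m):
    """Binomial standard error of a proportion p in a sample of m."""
    return math.sqrt(p * (1 - p) / m)


def se_mean(sd, m):
    """Standard error of the mean for a random sample of m students."""
    return sd / math.sqrt(m)


def band(center, se, k):
    """Interval of k standard errors on either side of center."""
    return center - k * se, center + k * se


if __name__ == "__main__":
    se_rank = se_percentile_rank(0.5, 100)
    # Percentile rank of 50 for a score of 20, expressed in percent.
    print(band(50, 100 * se_rank, 1), band(50, 100 * se_rank, 2))
    print(se_mean(10, 100))
```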
Although means are often presented in conjunction with norms
tables, the norms table itself usually gives the fiftieth
percentile or median. Assuming a normal distribution, it is known
that the standard error of a median is 25 percent larger than the
standard error of the mean. The standard error of other percentile
points can be obtained from the standard error of the mean by
multiplying by the factor $\sqrt{p(1 - p)}/z$ [cf. reference
illegible in source], where p is the proportion of cases above [or]
below the percentile point, and z is the corresponding ordinate of
the normal curve. It is thus found that for the present numerical
example the standard error for the median is 1.25 score points; for
the first and third quartiles, 1.36 points; for the first and ninth
deciles, 1.71 points; for the fifth and ninety-fifth percentiles,
2.11 points; and for the first and ninety-ninth percentiles,
[value illegible in source] points.

In what follows, attention will at first be restricted to the
standard error of the mean. A simple [method] for obtaining the
standard error of any percentile rank will be described in a later
section.

* A summary of notation is given for easy reference at the end of
this article. In general, capital letters are used for population
values, small letters for sample values. The students in a given
grade in a school are for this purpose considered to constitute a
(finite) population.

Cluster Sampling 


From a logical point of view, the fundamental unit for norms
purposes ordinarily is the individual student (an exception is
"school-mean norms," which will be discussed in a later section).
In practice, however, the unit of sampling used in the development
of national norms ordinarily has been the school, not the
individual student. The author or publisher typically attempts to
obtain a "representative" set of schools; he then tests all
students at a given grade level, or in a given course, in those
schools.

This type of sampling is called cluster sampling. It is in many
cases a natural and reasonable way to obtain a norms sample,
especially at the elementary and secondary school levels, where it
is likely to be easier for a school to test all students in a given
grade or course than to test some fraction of these students. The
fact is frequently overlooked, however, that the elimination of
large sampling fluctuations from the norms table requires not that
a large number of students shall have been tested, but that a
relatively large number of schools shall be represented in the
norms. The exact situation can be made clear by giving the formula
for the standard error of the mean score in cluster sampling and by
a numerical example.


If a sample of n schools is drawn at random from a population of
schools willing to test, and if all $M_i$ ($i = 1, \ldots, n$)
students in school i who meet certain conditions (who, for example,
are at a given grade level) are tested, then the mean of the
resulting norms distribution is exactly equal to the weighted
average of the school means, the weights being the frequencies
$M_i$:

$$y' = \frac{\sum_{i=1}^{n} M_i \bar{Y}_i}{\sum_{i=1}^{n} M_i}, \tag{1}$$

where $y'$ is the quantity defined by (1), being equal in the
present case to the mean of the norms distribution, and
$\bar{Y}_i$ is the mean score of the students tested at school i.*
The quantity $M_i$ will be referred to as the "size of school i,"
although strictly speaking it is usually the size of a single grade
at school i.

In order to write down the standard error of $y'$,


is article. In general, capitalletters 


at the end of th | i 
mple values. The students in a given g rade in 
ea (finite) population. 
certain relevant details must first be specified:

1. It will be assumed that the total population of schools is
   large compared to the number of schools in the sample randomly
   drawn from it.

2. Each school in the population has an equal chance of being
   included in the norms sample. As will be seen later, this is
   not always the best way to sample, but it is a natural way,
   since under this procedure each student in the schools in the
   population has an equal chance of being included.

3. The number of schools to be included in the sample is
   specified in advance. (An alter-

[...] this special case (1) becomes

[equation illegible in source]

where $S(\bar{Y}_i)$ is simply the standard deviation of the school
means in the norms sample at hand [...] being of equal size
incr[eases] [...] above that in [(2')] [...]

[...] throughout the nation, excluding sc[hools with fewer] than
10 sixth-grade students, give the estimated values $S_M = 32$,
$\bar{M} = 58$; similar data for 426 tenth grades show
$S_M = 91$, $\bar{M} = 108$; similar data for 208 thirteenth grades
show $S_M = 700$, $\bar{M} = 531$. Although these numerical results
cannot be taken as exact and final, they are sufficient to rule out
the assumption that variation in school size is small compared to
its average value.

If $M_i$ and $\bar{Y}_i$ are considered to be two stochastic
variables, a numerical value of each being associated with each
school, then $y'$ can be written as a function of moments of these
variables:

$$y' = m'_{11}/m'_{10},$$

where $m'$ is the usual symbol for a raw sample moment.
Application of the usual methods for obtaining large-sample
standard errors [4:208 ff] then yields the following approximate
result (presented here for completeness; the reader may wish to
skip it):


См + 
SMS’ (Y) 
gi. 300] св. а 
s (Yi)SM 
205, 3 204 , 
sus) МР J 


where $C_M = S_M/\bar{M}$ is the coefficient of variation for
school size, $\rho$ is the population correlation between school
size ($M$) and average school achievement ($\bar{Y}_i$),
$\mu_{22} = \sum^{N} (M_i - \bar{M})^2 (\bar{Y}_i - \mu_{01})^2 / N$
is a fourth-degree moment for the N schools in the population,
$\mu_{12} = \sum^{N} (M_i - \bar{M})(\bar{Y}_i - \mu_{01})^2 / N$,
and $\mu_{21}$ [...] [re]duces to equation 2', in [...]

A good approximation to (2) can be obtained in most cases by
assuming that observed school achievement is totally independent of
school size. Mollenkopf [5], using four achievement tests and three
aptitude tests, found a median correlation of .00 between school
size and mean score in [...] ninth grades and a median correlation
of .01 in 106 twelfth grades. Zero correlation is not the same as
total independence, but the latter may [...]. Under this
assumption,
$\mu_{22} = \sum^{N} (M_i - \bar{M})^2 (\bar{Y}_i - \mu_{01})^2 / N = S_M^2 S^2(\bar{Y}_i)$,
$\mu_{12} = \mu_{21} = 0$, and (2) can [be written]

$$S.E.^2(y') = \frac{S^2(\bar{Y}_i)}{n}\,(1 + C_M^2). \tag{2''}$$


A numerical example will throw some further light on the general
magnitude of this standard error. For the sake of the present and
subsequent numerical examples, it will be convenient to assume that
the test scores are expressed on a standardized scale with a mean
of $\bar{Y} = 50$ and a standard deviation of $S_Y = 10$ for the
total population of students. An examination of norms tables and
other data for several different aptitude and achievement tests at
grade levels ranging from the [...] to the fourteenth has shown
the standard deviation of school means to range from around [...]
up to around six-tenths of the standard deviation of all students'
scores. Hence, the standard deviation of school means will here be
taken as four points [$S(\bar{Y}_i) = 4$]. If $C_M$ is taken to be
32/58 = .55, as in the sixth-grade data mentioned above, the
standard error of the mean of a norms table based on a cluster
sample of (say) n = 36 schools will, by equation 2'', be
approximately $4\sqrt{(1 + .55^2)/36} = .76$. In comparison, this
same standard error, 0.76, would apply to the mean of a norms table
based on a completely random sample of 173 students, since for
such a sample the standard error would be $10/\sqrt{173} = 0.76$.
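
The arithmetic of this example can be reproduced as follows. The sketch assumes that equation 2'' has the form $S.E.^2(y') = S^2(\bar{Y}_i)(1 + C_M^2)/n$, which is the form the numerical example evaluates; the function names are hypothetical, and the figure of 2160 students used for the efficiency ratio is simply the one quoted in the text.

```python
import math


def se_cluster_mean(sd_school_means, cv_school_size, n_schools):
    """Approximate standard error of the norms-sample mean under
    simple cluster sampling, assuming school size and school mean
    are independent (equation 2'')."""
    return sd_school_means * math.sqrt((1 + cv_school_size ** 2) / n_schools)


def equivalent_random_sample(sd_students, standard_error):
    """Number of randomly sampled students giving the same
    standard error of the mean."""
    return (sd_students / standard_error) ** 2


if __name__ == "__main__":
    # Sixth-grade example: S(Y_i) = 4, C_M = .55, n = 36, S_Y = 10.
    se_sixth = se_cluster_mean(4, 0.55, 36)
    n_random = equivalent_random_sample(10, se_sixth)
    print(round(se_sixth, 2), round(n_random))
    # Efficiency relative to the 2160 students quoted in the text.
    print(round(round(n_random) / 2160, 2))
    # Tenth-grade example: C_M = .84.
    print(round(se_cluster_mean(4, 0.84, 36), 2))
```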


If $C_M$ is taken to be 91/108 = .84, as in the tenth-grade data
mentioned above, the corresponding standard error for cluster
sampling would be approximately $4\sqrt{(1 + .84^2)/36} = .87$.
This same standard error would apply to the mean of a norms table
based on a random sample of only [...] students.

For a fixed number of students, the (statistical) efficiency of
cluster sampling of schools relative to random sampling of students
is conveniently [measured by comparing] the numbers [of students]
that would have to be tested in order [to r]each a given size of
standard error under each [procedure]. Thus, 2160 [...] each of
[...] achieve[...] by [...]. [...] the efficiency of cluster
sampling [is] 173/2160 = .08. In the tenth grade, the efficiency of
cluster sampling is only .025, indicating that about forty times as
many students must be tested when cluster sampling is used than
would be required under simple random sampling. This efficiency at
the college level would be even lower.

The foregoing discussion should give a clear picture of the
relative (statistical) efficiency of cluster sampling for norms
purposes. [The purpose] is not to urge that cluster sampling of
schools for norms purposes be abandoned in favor of simple random
sampling of students, since this is clearly administratively
infeasible in most school situations. Rather, the purpose is to
emphasize that it is the number of schools, not the number of
students, that determines the reliability of the typical norms
table. In contrast, it is usually the number of students, not the
number of schools, that primarily determines the cost of norming a
typical psychological test. It is thus both economically and
statistically important to find practical ways of increasing the
number of schools tested while holding constant, or possibly
decreasing, the total number of students tested, thus increasing
the accuracy of the norms without increasing the already high cost
of obtaining them.


How Much Sampling Error Is Tolerable 
in a Norms Table? 


Before going ahead to discuss possible improvements, some
consideration may be given to the practical effect of the sampling
errors of norms tables and to the question of reasonable
tolerances. Is the standard error of .87 found in the tenth-grade
numerical example small enough to be tolerated? The answer to this
question, of course, depends on the use to which the norms are
being put.

If the test under discussion has a reliability of .91 for the
entire norms group, its average standard error of measurement is
$10\sqrt{1 - .91} = 3.0$. If the score of an individual student is
being interpreted, an error of .87 in the norms table is small
compared to the standard error of measurement of 3.0. Thus in this
particular case, with the examinee near the middle of the
distribution, sampling fluctuations in the norms tables will affect
the interpretation of his test score much less than will sampling
fluctuations due to unreliability of the test. For an examinee with
a score near an extreme of the norms distribution, the situation is
somewhat different, since fluctuations in the norms tables are very
much more severe near the extremes, as already discussed in some
detail.

A comparison of the sampling fluctuations in norms tables with
sampling fluctuations due to test unreliability is likely to be
misleading. Errors of measurement vary at random from one
individual to the next and hence tend to cancel out in any work
with group means; the same norms table is ordinarily used for all
the individuals tested, however, and hence errors in the norms
tables do not cancel out but are repeated time after time.

Suppose that separate norms are to be presented for northern and
for southern schools; suppose that the norms sample consists of 34
schools, 25 of which are in the north and 9 of which are in the
south; and suppose that $S(\bar{Y}_i) = 4$ and $C_M = .84$ within
each region. By equation 2'', the standard error of the mean of the
norms distribution is 1.0 for the north and 1.7 for the south. The
standard error of the median of each norms distribution will be
about 25 percent larger, or 1.3 and 2.2 respectively. The standard
error of the difference between the medians of the two norms tables
is $\sqrt{(1.3)^2 + (2.2)^2} = 2.5$ approximately.

If the true difference between the medians of the norms
populations is actually 2.5 score points in favor of the northern
region, there is one chance in 6 that the medians of the published
norms will reverse this direction of relationship. If the true
difference between northern and southern norms populations is 5
score points, there is still one chance in 20 that the published
norms will present the opposite of the correct picture.
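
The regional figures can be traced step by step. The sketch below is illustrative only: it assumes equation 2'' in the form $S(\bar{Y}_i)\sqrt{(1 + C_M^2)/n}$, the 25 percent inflation for medians stated in the text, and a normally distributed difference between the two sample medians for the reversal probability. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def se_cluster_mean(sd_school_means, cv_school_size, n_schools):
    """Standard error of the norms-sample mean under cluster
    sampling (assumed form of equation 2'')."""
    return sd_school_means * math.sqrt((1 + cv_school_size ** 2) / n_schools)


def se_median(se_of_mean):
    """Median standard error, about 25 percent larger than the mean's."""
    return 1.25 * se_of_mean


def prob_reversal(true_difference, se_difference):
    """Chance that the observed difference has the opposite sign,
    assuming a normal sampling distribution."""
    return NormalDist().cdf(-true_difference / se_difference)


if __name__ == "__main__":
    se_north = se_cluster_mean(4, 0.84, 25)
    se_south = se_cluster_mean(4, 0.84, 9)
    se_diff = math.hypot(se_median(se_north), se_median(se_south))
    print(round(se_north, 1), round(se_south, 1), round(se_diff, 1))
    # True difference of 2.5 points against a standard error of 2.5.
    print(round(prob_reversal(2.5, 2.5), 2))
```

With a true difference equal to one standard error, the reversal probability is about .16, which is the "one chance in 6" of the text.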
This example illustrates the more dramatic 
type of error that might result from an inadequate 
norms sample. Actually, of course, it is hardly 


A Similar situation Would exist if t 


all and s 


both occasions. 
The foregoing considerations 


lowable tolerances for Sampling fluctuations in the 
norms. They Should, however, indicate Some of 
the quantitative considerations th 


Biased and Unbiased Estimates in
Cluster Sampling


tion. The nature of t 
ent from a considera: 
where the number of 


orms sample 
atever school 


is randomly chosen from the norms prn 
The expected value of the mean of the norms 
ple over a large number of similar samples is 


N_ 
Е(Ү) = pb, = (2 YN , 


where N is the number of schools inthe norms pop- 


К id 
ulation. Since the mean score in the norms pop 
lation is 


the expected value of the mean of the norms чай 
ple is not equal to the mean of the norms popu Б 
tion unless either (i) all schools are the same т 
Ог (ii) school size is uncor related with scho : 
achievement. The bias is of order 1/n, and bud 
tends to disappear when the number of schools 1 З 
large. (It сап be shown by expanding (1) in a Bri 
lor's series and taking expected values that is 
School size were normally distributed (which it ^ 
not), the bias when n is large would be bon 
mately EY' - Y - CMS(Yi)p/n, where Ey' is lé 
expected value of the mean of the norms gon E 
and p is the population correlation between scho 
Size and average school achievement. ) the 
An unbiased estimate of the mean score of the norms population can
be constructed from the same data that were used to construct the
biased estimate of equation 1. The unbiased estimate is

$$y'' = \frac{1}{n\bar{M}} \sum_{i=1}^{n} M_i \bar{Y}_i. \tag{3}$$

($\bar{M}$, the mean school size in the norms population, is
considered as known in advance, so that it need not be estimated
from the sample.)

Under the assumption that school size and school mean score are
unrelated in the norms population, the sampling variance of the
unbiased estimate for large n is found to be approximately

$$S.E.^2(y'') = S.E.^2(y') + C_M^2 \bar{Y}^2 / n. \tag{4}$$

It is clear that [when school size and] achievement ar[e
unrelated, the unbiased esti]mate has a lar[ger standard error than
the bi]ased estimate of (1), except in the special case where
$\bar{Y} = 0$. [...] the choice of the [...] [of measure]ment used
to report test results is an arbitrary choice. The sta[ndard error
of the unbiased esti]mate could th[...] and frequently will be 50
or 500 if scaled scores are used. With data of this sort, the
sampling error of the unbiased estimate becomes huge. In the case
of the numerical example for the sixth grade with $\bar{Y} = 50$,
the standard error is found from (4) to be
$\sqrt{.76^2 + .55^2 \times 50^2/36} = 4.65$, in contrast to a
standard error of .76 for the biased estimate $y'$.
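
The comparison between the two estimates can be checked with a short computation. This sketch assumes that equation (4) has the form $S.E.^2(y'') = S.E.^2(y') + C_M^2 \bar{Y}^2/n$, which is the expression the numerical example evaluates; the function name is hypothetical and the inputs are the sixth-grade values used in the text.

```python
import math


def se_unbiased_estimate(se_biased, cv_school_size, grand_mean, n_schools):
    """Approximate standard error of the unbiased estimate y'',
    assuming school size and school mean are unrelated
    (assumed form of equation 4)."""
    extra_variance = (cv_school_size ** 2) * (grand_mean ** 2) / n_schools
    return math.sqrt(se_biased ** 2 + extra_variance)


if __name__ == "__main__":
    # Sixth-grade example: S.E.(y') = .76, C_M = .55, mean 50, n = 36.
    print(se_unbiased_estimate(0.76, 0.55, 50, 36))
    # Moving the scale origin so that the mean is 0 removes the penalty.
    print(se_unbiased_estimate(0.76, 0.55, 0, 36))
```

The second call illustrates the remark that the extra term vanishes only when the scale is chosen so that the mean is zero.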
The true nature of the unbiased estimate can be clearly seen in the
following example where the sample consists of only one school.
Suppose that by chance the school chosen is about half the size of
the average school in the norms population. Then by equation 3,
the unbiased estimate of the population mean will be obtained by
computing the mean of the students in the school selected and then
dividing this mean by 2. On the other hand, if the school selected
at random for the sample happens by chance to be about twice the
size of the average school in the norms population, then the
unbiased estimate is obtained by doubling the mean score for the
school! Such an estimate might be a reasonable one if there were
known to be a high negative correlation between school size and
school achievement; it is clearly an unreasonable one if there is
little or no correlation between school size and achievement, as is
assumed here.

Since the large standard error of the unbiased estimate is clearly
intolerable, this estimate is of no further interest to us here.
This should cause the reader no uneasiness, since unbiasedness is
merely an arbitrary requirement. An unbiased statistic is defined
as one whose arithmetic mean over all samples is equal to the
corresponding population mean. The definition of bias is thus based
on a choice of the arithmetic mean as the appropriate measure of
central tendency, rather than of some other measure such as the
median or weighted average. This choice is an arbitrary one, since
the arithmetic mean is far from being the "best" measure of central
tendency for every kind of situation.


Two-Stage Sampling 
The fact that the sampling error in the usual type of norms sample
depends primarily on the number of schools tested rather than on the
number of students tested, as seen in equations 1 and 2, suggests that
the sampling errors can be reduced without increasing the number of
students tested, provided arrangements can be made to test more schools,
with fewer students per school. The practical problems involved in such a
procedure will be briefly discussed before going on to the statistical
considerations.

Although it might be possible for many school principals to segregate a
small number of students from the group to which they belong in order to


administer one or more tests to them for norma- 
tive purposes, such a procedure would probably 
not be advisable at the lower educational levels 
since the fact of selecting and isolating these stu- 
dents from their customary group might cause 
them to give atypical test performances. Such stu- 
dents might, for example, be unusually nervous 
or uncooperative during the testing period. 

Even if the segregation of small numbers of stu- 
dents at the primary or secondary educational lev- 
els is ruled out, the objective of testing more 
schools, with fewer students per school, can still 
be achieved in certain cases where two or more 
tests can be normed simultaneously. If it is pos- 
sible for the tests to be all administered simultan- 
eously in the same examination room, then, if 
there are k tests, each test need be administered 
to a subsample of only 1/k of the students in each 


school. 
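The division of one school's students among k tests given in the same room can be sketched as follows. This is an illustration only: the function name, the seed argument, and the roster are assumptions introduced here, not part of the original procedure.

```python
import random


def assign_tests(student_ids, k, seed=0):
    """Randomly split the students of one school among k tests.

    Hypothetical helper: each test receives about 1/k of the students.
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled students out to tests 0..k-1 in rotation.
    return {sid: position % k for position, sid in enumerate(shuffled)}


roster = list(range(30))                 # 30 students in one school (made up)
assignment = assign_tests(roster, k=3)   # student id -> test number
counts = [list(assignment.values()).count(t) for t in range(3)]
```

Because the students are shuffled before the tests are dealt out, each test is taken by a random third of the school, and every student still sits in the same examination room.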


A rather different situation pertains at the college level. Here it is
usually impossible to test all students in any one college
simultaneously; furthermore, it is usually equally impossible, because of
conflicts in the students' schedules, to obtain any sizable random
subsample of students who can be tested at any one session. The only way
a really random sample of college students can be obtained, therefore, is
to work with very small subsamples (perhaps from 2 to 10 students) in
each college.
At the lower educational levels it is convenient for the size of the
subsample to be proportional to the size of the school from which it is
drawn. At the college level such a procedure might produce excessively
large subsamples in the larger colleges; hence it may be most convenient
for the number of students subsampled from each college to be the same
irrespective of the size of the college. These two situations will be
treated separately in the following sections.

It is also possible to fix the size of the subsample in some way so that
it is neither equal for all schools nor proportional to the size of the
school. A general discussion of the almost endless array of possible
sampling and estimating procedures will be found in texts on techniques
and theory of sampling [e.g., 1; 3]. Most of the specialized formulas
given below can be derived from the very general formulas given in such
texts, but the process of simplifying the textbook formulas and
translating them in terms of observable statistics convenient for the
present problem is in most cases a more formidable task than deriving the
desired specialized formulas ab initio, as in most cases will be done
here.

An excellent, very readable and relevant discussion of general sampling
principles is given in [2].
ui Paket describes an actual application of 


procedures for obtaining national norms for 15-year-olds on a reading
test.
Two-Stage Sampling With Size of Subsample Proportional to Size of School


In previous sections, there has been no subsampling within each school,
and the same symbol, $M_i$, has denoted the size of school i and the
number of students sampled from school i, since these have been identical
quantities. Hereafter, $m_i$ will be used to denote the number of
students in the subsample drawn from school i; $M_i$ will denote the
total number of students in the group in school i from which the
subsample of students is drawn.

Consider the case where the size of the subsample is to be proportional
to the size of the school, i.e., where $m_i$ is proportional to $M_i$.
Here, as before, it is natural to use the arithmetic mean of the scores
of students in the norms sample as an estimate of the corresponding mean
in the norms population. The formula is the same as that given for $y'$
in equation 1 except that the subsample mean, $\bar{y}_i$, has been
substituted for the school mean $Y_i$:

$$\bar{y} = \frac{\sum M_i \bar{y}_i}{\sum M_i} = \frac{\sum m_i \bar{y}_i}{\sum m_i}. \qquad (5)$$

As before also, this estimate is biased; however this does not seem to be
sufficient reason to discourage its use.

Since $\bar{y}$ results from a two-stage sampling procedure, its sampling
variance is a resultant of sampling fluctuations occurring at each of the
two stages. [The reader who wishes to skip the derivation should skip to
equation 13 and refer to the appendix for the notation.] This sampling
variance can be expressed in terms of its two components by means of a
very general formula:

$$\mathrm{Var}\,\bar{y} = E_1(\mathrm{Var}_2\,\bar{y}) + \mathrm{Var}_1(E_2\,\bar{y}), \qquad (6)$$

where E denotes the average or "expected" value over all possible
samples, and the subscripts 1 and 2 refer to the stage of sampling.

For any given set of n schools, $E_2\bar{y}$ is the expected or average
value of $\bar{y}$ over all possible subsamples within the given set of
schools. Since the values of $m_i$ are fixed once the schools are chosen,
it is obvious that $E_2\bar{y} = y'$. Thus the second term in (6) is the
same as the sampling variance given by (2).

In order to get the first term in (6), let $\bar{m} = \sum m_i/n$, so
that

$$\mathrm{Var}_2\,\bar{y} = \mathrm{Var}_2\!\left(\frac{1}{n\bar{m}} \sum m_i \bar{y}_i\right) = \frac{1}{n^2\bar{m}^2}\,\mathrm{Var}_2\!\left(\sum m_i \bar{y}_i\right), \qquad (7)$$

the values of $m_i$ being fixed. Since the n values of $\bar{y}_i$ vary
independently,

$$\mathrm{Var}_2\,\bar{y} = \frac{1}{n^2\bar{m}^2} \sum m_i^2\,\mathrm{Var}_2\,\bar{y}_i. \qquad (8)$$

Now $\mathrm{Var}_2\,\bar{y}_i$ is simply the usual sampling variance of
the sample mean obtained when $m_i$ cases are drawn without replacement
from a finite population of $M_i$ cases:

$$\mathrm{Var}_2\,\bar{y}_i = \frac{M_i - m_i}{m_i M_i} S_i^2 = \frac{1 - f}{m_i} S_i^2, \qquad (9)$$

where $S_i$ is $\sqrt{M_i/(M_i - 1)}$ times the standard deviation of the
scores of the students in school i, and $f = m_i/M_i$ is the (constant)
subsampling fraction. Thus,

$$\mathrm{Var}_2\,\bar{y} = \frac{1 - f}{n^2\bar{m}^2} \sum m_i S_i^2. \qquad (10)$$

Assuming either (i) that the variance of scores within a school is, to an
adequate degree, unassociated with size of school, or (ii) that
within-school variance is approximately the same from school to school,
it follows that

$$\mathrm{Var}_2\,\bar{y} = \frac{1 - f}{n\bar{m}}\, a(S_i^2), \qquad (11)$$

where $a(S_i^2) = \sum S_i^2/n$ is the arithmetic mean of the
within-school variances ($S_i^2$) of the schools in the sample. It
remains to find the expected value of the quantity in (11) over all
possible samples of schools. Using again the assumption that the
within-school variance either is unrelated to school size or is the same
in all schools, it is found that approximately

$$E_1(\mathrm{Var}_2\,\bar{y}) = \frac{1 - f}{nf\bar{M}}\, A(S_i^2), \qquad (12)$$

where $A(S_i^2)$ is the arithmetic mean value of the within-school
variances for the norms population.
Thus from (6), (12), and (2''),

$$\mathrm{Var}\,\bar{y} = \frac{1 - f}{nf\bar{M}}\, A(S_i^2) + \frac{1}{n}(1 + C_M^2)\, S^2(Y_i). \qquad (13)$$


It is seen that if the subsampling fraction f is 1 (i.e., if, in effect,
there is no second-stage sampling), the first term on the right side of
(13) vanishes. Equation 13 shows how the sampling variance of the mean of
the norms sample depends both on the number of schools in the sample and
on the proportion of students sampled from each school. A numerical
example will help to make clear the economies that may be achieved by
slightly increasing the number of schools in the norms sample while
drastically reducing the number of students tested in each school.

In the numerical example already given for the sixth grade, $C_M = .55$,
$S(Y_i) = 4$, and $\bar{M} = 58$. Also, since $S_y = 10$ and
$S(Y_i) = 4$, the value of $A(S_i^2)$ must be approximately equal to
$S_y^2 - S^2(Y_i) = 84$. Given these numerical values, the top half of
Table I shows for different subsampling proportions (f) the number of
schools (n) that must be included in the norms sample in order that the
sample mean would have the same size of standard error that would be
obtained if the usual cluster-sample method of sampling were used with
n = 36 schools, as represented in the first line of the table. The
next-to-last column of the table shows the expected number of examinees
($nf\bar{M}$) required to obtain this degree of accuracy. The last column
shows the ratio of the number of examinees required under the proposed
two-stage sampling procedure to the number of examinees required by the
usual cluster-sample method. The last line of the sixth-grade section
shows, for example, that if only 2 percent of the students are tested in
each school, the total number of students tested need be only 9 percent
as many as would be required by the usual procedures under which all
students are tested in any given school.

The bottom half of Table I gives similar information for the tenth-grade
numerical example; here, $C_M = .84$ and $\bar{M} = 108$, the other
values being as before. The last column of the table shows the economies
resulting from two-stage sampling.
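The sixth-grade figures quoted above can be reproduced with a short calculation. The sketch below is illustrative only: it takes equation 13 to be Var = (1 - f) A(S^2) / (n f M-bar) + (1 + C_M^2) S^2(Y_i) / n, and the function and variable names are assumptions introduced here.

```python
def var_two_stage(n, f, mean_size, a_s2, c_m, s2_school_means):
    """Sampling variance of the norms-sample mean (equation 13 as assumed above)."""
    within = (1.0 - f) / (n * f * mean_size) * a_s2      # second-stage term
    between = (1.0 + c_m ** 2) * s2_school_means / n     # first-stage term
    return within + between


def schools_needed(f, target_var, mean_size, a_s2, c_m, s2_school_means):
    """Number of schools giving target_var; both terms are proportional to 1/n."""
    return var_two_stage(1, f, mean_size, a_s2, c_m, s2_school_means) / target_var


# Sixth-grade values quoted in the text.
M_BAR, C_M, S2_Y, A_S2 = 58, 0.55, 16.0, 84.0

# Usual cluster sampling: all students (f = 1) in n = 36 schools.
target = var_two_stage(36, 1.0, M_BAR, A_S2, C_M, S2_Y)
cluster_se = target ** 0.5

# Two-stage sampling with 2 percent of the students in each school.
n_needed = schools_needed(0.02, target, M_BAR, A_S2, C_M, S2_Y)
examinees = n_needed * 0.02 * M_BAR
ratio = examinees / (36 * M_BAR)
```

Under these assumptions the cluster-sample standard error comes out near .76, about 159 schools are needed when f = .02, and the number of examinees is roughly 9 percent of the 2,088 required by the cluster method, in line with the text.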


Application of Standard Error Formulas to Percentile Ranks in the Norms Table


As already pointed out, in the case of simple random sampling from a
normal distribution the sampling variances of the various quantiles
(percentiles) have certain stated ratios to the sampling variance of the
mean. No formulas for the corresponding sampling variances in two-stage
sampling are presently known to the writer. Although the ratio found for
the sampling variance of the median in the simple random normal case
probably provides a useful approximation for many or most practical
situations, any similar generalization for the more extreme percentile
points would be of doubtful value.

Actually, the most common use of norms is not to determine the score
having a given percentile rank, but rather to determine the percentile
rank of a given score. Fortunately any formula for the standard error of
the mean of the norms sample can, by a simple change in the meaning of
certain symbols, be used to represent the standard error of any
percentile rank determined from the norms table.

Let x denote any test score, and let P(x), or 
simply P, denote the proportion of cases in the 
norms population lying at or below a score of x. 
Consider now a set of data, paralleling the test- 
score data, in which every score at or below x has 
been replaced by 1 and every score above x has 
been replaced by 0. P is the arithmetic mean of 
all these 0’s and 1’s for the whole norms popula- 
tion. If the proportion of cases lying at or below 
score x in a norms sample is denoted by p(x), or 
simply by p, this sample estimate of P is simply 
the arithmetic mean of the 0's and 1's in the norms
sample. Thus any formulas for the standard error
of the mean of a norms sample can be used
for the standard error of any percentile rank
simply by replacing $\bar{y}$ with p and $\bar{Y}$ with P.
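The recoding of scores into 0's and 1's can be sketched in a few lines. The example is illustrative only; the function names and the eight scores are assumptions introduced here.

```python
def indicators(scores, x):
    """Replace every score at or below x by 1 and every score above x by 0."""
    return [1 if s <= x else 0 for s in scores]


def proportion_at_or_below(scores, x):
    """The proportion p(x) is just the arithmetic mean of the 0/1 data."""
    coded = indicators(scores, x)
    return sum(coded) / len(coded)


def indicator_variance(scores, x):
    """Variance (divisor n) of the 0/1 data, computed the long way."""
    coded = indicators(scores, x)
    p = sum(coded) / len(coded)
    return sum((v - p) ** 2 for v in coded) / len(coded)


scores = [12, 15, 15, 18, 20, 22, 25, 30]   # made-up test scores
p = proportion_at_or_below(scores, 18)
var_01 = indicator_variance(scores, 18)
```

Since the variance of 0/1 data with mean p is always p(1 - p), any standard error formula written for a mean carries over to a proportion without separate derivation.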

Equation 13, for example, remains unchanged except that $\bar{y}$ is
replaced with p and $S^2(Y_i)$ is replaced by $S^2(P_i)$, where $P_i$ is
the proportion of cases in school i lying at or below x and $S^2(P_i)$ is
the variance of these proportions. The quantity $S_i^2$, the
within-school score variance, must now be computed from the 0's and 1's
in school i; the computation may be simplified by using the familiar
formula $S_i^2 = P_i(1 - P_i)$. With these modifications, equation 13 is
ready to be applied for determining the standard error of a percentile
rank in a sample norms table:

$$\mathrm{Var}\,p = \frac{1 - f}{nf\bar{M}}\, A[P_i(1 - P_i)] + \frac{1}{n}(1 + C_M^2)\, S^2(P_i). \qquad (14)$$


It is obvious that the more the schools in any particular norms
population differ from each other in the values of $P_i$ the more the
advantage to be gained by subsampling within schools. Some idea of the
size of the relevant standard errors can be obtained from a numerical
example based on actual norms data.

The first line of Table II was computed from the test scores of all
tenth-grade students in a 35-school
TABLE II

THE STANDARD ERROR OF THE PERCENTILE RANK OF SCORES NEAR SELECTED
POPULATION PERCENTILE POINTS FOR VARIOUS SUBSAMPLING RATIOS

                                 Standard Error of Percentile Rank
                                          Near Population
Subsampling   Number of          50th          75th          90th
ratio (f)     schools in         percentile    percentile    percentile
              sample (n)

 1.0*              35            .041          .030          .015
  .5               70            .029          .022          .011
  .1              350            .015          .011          .006
  .01            3500            ...           .007          .005

* This row represents the usual type of simple cluster sampling.

TABLE III

A COMPARISON OF THE STANDARD ERROR OF THE MEAN OBTAINED BY THREE
METHODS OF SAMPLING COLLEGE FRESHMEN (n = 100)

Expected                     Number of        Standard Error of Mean**
number of     Subsampling    examinees        Equations:
examinees     ratio          per school*      (13)     (18)     (20)

53,100        1.00           531              ...      ...      ...
26,550         .50           266              .66      .61      .40
13,275         .25           133              .66      ...      .41
 5,310         .10            53              .67      .69      .42
 2,655         .05            27              .68      .72      .44
 1,062         .02            11              ...      ...      ...
   531         .01             5              ...      .93      .56

* In the case where the size of subsample is proportional to school
size, this column gives the average number of examinees per school.
** Assuming school size to be unrelated to achievement and to
within-school variance.
norms sample. Equation 2'' was used, with p and P substituted for
$\bar{y}$ and $\bar{Y}$, and with sample statistics substituted for the
corresponding unknown population values. In this particular set of data,
the total number of students in the sample was 3748, the average school
size was 107, the coefficient of variation for school size was .88, and
the correlation between school size and school mean score was 0.18.

It should be noted that Table II gives the standard errors of
percentages, not of percentile points. The standard error of the mean of
a norms distribution is expressed in test-score units; the standard
errors in Table II are not. The table states, for example, that in simple
cluster sampling the proportion of cases in the norms sample lying below
the population median will fluctuate from sample to sample with a
standard deviation of .041; i.e., the population median score would have
an observed percentile rank between 45.9 and 54.1 in about two-thirds of
all similar norms samples that might be drawn.


All four lines of Table II refer to sampling from the same norms
population.


error that, with minor 
same at all percentile 


Two-Stage Sampling With Size of Subsample Fixed


Up to now, the main concern has been with the situation where the size of
the subsample ($m_i$) is proportionate to the size of the school
($M_i$). This procedure automatically gives the larger schools a weight
proportional to their size. As already pointed out, especially in testing
college students for norms purposes, it may be administratively
impossible to obtain truly random subsamples if the subsample size must
be proportional to size of the college. It is helpful, therefore, to
consider the case where the size of the subsample is the same for all
subsamples. The more general case where the size of subsample may be
arbitrarily varied from college to college will not be treated here. The
interested reader can find the necessary basic theory in appropriate
texts [e.g., 1, Ch. 11].

Two possible ways of giving proper weight to the larger schools suggest
themselves. These will be treated separately.

The necessary notation is the same as before except that $m_i$ is
constant and the subscript i will consequently be dropped from this
symbol.


Weights Applied Computationally After 
Collection of Data 


When the schools in the sample have been drawn at random, the weighted
mean

$$\bar{y}_w = \frac{\sum M_i \bar{y}_i}{\sum M_i} \qquad (15)$$

provides an acceptable estimate of the mean score in the norms
population. The formula for $\bar{y}_w$ is the same as the formula for
$\bar{y}$ given previously; in the present case, since size of subsample
is not proportionate to size of school, equation 15 represents a weighted
rather than an unweighted mean of the scores of the students in the
sample. Equation 15 assigns appropriate weights to the different values
of $\bar{y}_i$ according to the size of the school that each mean
represents.
These weights ($M_i$) are not, in general, the weights that will
minimize the sampling variance of the estimate. It may happen, in fact,
that the unweighted mean of the $\bar{y}_i$ will have a smaller sampling
variance than will the weighted mean. Both the weighted and the
unweighted means of the $\bar{y}_i$ are, in general, biased. The
unweighted mean, however, has an additional disadvantage so serious as to
prevent its use for many or most purposes and to rule it out of further
consideration here. The unweighted mean is not only a biased estimate, it
is an inconsistent estimate; i.e., its value does not, in general,
approach that of the population mean when the sample is made larger. (An
exception would arise if it were known with certainty that school size
and school achievement were unrelated in the population, since in that
case there would be no advantage in weighting each subsample mean
according to school size. The assumption that school achievement is
unrelated to school size is often adequate for providing a convenient
approximation to the sampling error of a sample estimate. The order of
approximation required for such a purpose is quite different from that
required for published test norms, so this assumption cannot in general
be brought forward to justify the use of the unweighted mean here.)
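The inconsistency of the unweighted mean can be seen in the limiting case where every school in the population is in the sample and every school mean is known exactly. The sketch below is illustrative only; the three schools and their scores are made-up values in which larger schools happen to score higher.

```python
# (school size M_i, school mean score Y_i): hypothetical population of schools
schools = [(50, 45.0), (150, 50.0), (800, 58.0)]

total_students = sum(size for size, _ in schools)

# Mean score of all students in the population.
population_mean = sum(size * mean for size, mean in schools) / total_students

# Weighted mean of the school means, with weights M_i (form of equation 15).
weighted_mean = sum(size * mean for size, mean in schools) / total_students

# Unweighted mean of the school means.
unweighted_mean = sum(mean for _, mean in schools) / len(schools)
```

Even with no sampling error at all, the unweighted mean (51.0) misses the population mean (56.15), whereas the weighted mean reproduces it; enlarging the sample therefore cannot remove the discrepancy when size and achievement are related.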
It remains to obtain a standard error formula for $\bar{y}_w$, which can
be found, very much as before, from equation 6. Again, it is apparent
that $E_2\bar{y}_w = y'$. The equation corresponding to (10) is readily
found to be

$$\mathrm{Var}_2\,\bar{y}_w = \frac{1}{n^2 m\,[a(M_i)]^2} \sum M_i (M_i - m)\, S_i^2, \qquad (16)$$

where $a(M_i) = (\sum M_i)/n$ is the average size of the schools in the
sample. Equation 16 simplifies to

$$\mathrm{Var}_2\,\bar{y}_w = \frac{1}{nm}\, a(S_i^2) \left[(1 + c_M^2) - \frac{m}{a(M_i)}\right], \qquad (17)$$

where $c_M = s(M_i)/a(M_i)$ is the coefficient of variation of $M_i$
computed for the sample. Finally,

$$\mathrm{Var}\,\bar{y}_w = \frac{1}{nm}\, A(S_i^2) \left[1 + C_M^2 - \frac{m}{\bar{M}}\right] + \frac{1}{n}\, S^2(Y_i)(1 + C_M^2), \qquad (18)$$

approximately. If the schools do not vary much in size,

$$\mathrm{Var}\,\bar{y}_w = \frac{1}{nm}(1 - f)\, A(S_i^2) + \frac{1}{n}\, S^2(Y_i), \qquad (18')$$

with $f = m/\bar{M}$.

Since there is no difference whatever between $\bar{y}$ and $\bar{y}_w$
when all schools are of the same size, a check is provided by the fact
that (13) and (18') are equivalent when $C_M = 0$. When the schools
differ in size, the sampling variance given by (18) is larger than that
given by (13) for a fixed total number of examinees; hence the present
method is inferior to choosing size of subsample proportional to size of
school whenever the latter is administratively feasible.
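The comparison between the two methods can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes equation 13 as Var = (1 - f) A / (n f M-bar) + (1 + C_M^2) S^2(Y_i) / n and equation 18 as Var = A (1 + C_M^2 - m / M-bar) / (n m) + (1 + C_M^2) S^2(Y_i) / n, and every numerical value is hypothetical rather than taken from the article's tables.

```python
def var_proportional(n, f, mean_size, a_s2, c_m, s2_means):
    """Assumed form of equation 13 (subsample proportional to school size)."""
    return (1 - f) * a_s2 / (n * f * mean_size) + (1 + c_m ** 2) * s2_means / n


def var_weighted_average(n, m, mean_size, a_s2, c_m, s2_means):
    """Assumed form of equation 18 (fixed subsample size m, weighted mean)."""
    within = a_s2 * (1 + c_m ** 2 - m / mean_size) / (n * m)
    return within + (1 + c_m ** 2) * s2_means / n


# Hypothetical values: 100 schools, mean size 500, 10 examinees per school.
n, mean_size, m = 100, 500, 10
a_s2, c_m, s2_means = 80.0, 1.0, 16.0

f = m / mean_size   # same expected number of examinees under both methods
v13 = var_proportional(n, f, mean_size, a_s2, c_m, s2_means)
v18 = var_weighted_average(n, m, mean_size, a_s2, c_m, s2_means)
difference = v18 - v13
```

With the total number of examinees held equal, the two expressions differ by C_M^2 A / (n m), so the fixed-size method is never better and the gap grows with the variability of school size.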
The foregoing results can be extended beyond the problem of estimating
the mean score of the norms population to cover all the percentile ranks
in the norms table. Just as the unweighted mean in the present situation
does not in general provide an acceptable estimate of the mean of the
population, so a simple frequency distribution of the scores in the norms
sample fails to provide adequate estimates of the population percentile
ranks. If the norms table is to avoid being biased even in large samples,
the results from each school must be weighted according to the size of
the school. In preparing a frequency distribution for norms purposes this
can be achieved by counting each student in the norms sample as if he
were actually $M_i$ students. When this is done, equation 18 may be used
to give the sampling error of the percentile ranks appearing in the norms
table, the method for doing this being the same as that discussed in the
previous section.
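The weighted frequency count can be sketched as follows. The example is illustrative only: the function name and the two subsamples are assumptions introduced here, and each school is taken to contribute the same number of examinees.

```python
def weighted_percentile_rank(subsamples, x):
    """Proportion at or below x when each examinee counts as M_i students.

    subsamples -- list of (school size M_i, list of subsample scores)
    """
    total_weight = 0
    weight_at_or_below = 0
    for size, scores in subsamples:
        total_weight += size * len(scores)
        weight_at_or_below += size * sum(1 for s in scores if s <= x)
    return weight_at_or_below / total_weight


# Two schools with two examinees each (made-up sizes and scores).
subsamples = [(100, [40, 55]), (300, [60, 70])]

weighted_rank = weighted_percentile_rank(subsamples, 55)
all_scores = [s for _, scores in subsamples for s in scores]
simple_rank = sum(1 for s in all_scores if s <= 55) / len(all_scores)
```

The simple count puts a score of 55 at the 50th percentile, while the weighted count puts it at the 25th, because the larger school, whose examinees all scored higher, represents three times as many students.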


Sampling with Probability Proportional to Size


When the size of the subsample is constant for all schools, extra weight
may be given to the larger schools by weighting the data from these
schools as part of the computational procedure. Alternatively, it is
possible to introduce the desired weights into the sampling process
itself, thus eliminating any necessity for weighting of the sample data
after it has been collected.
All sampling methods considered up to this point start by drawing a
random sample of schools from the population of schools, each school
having an equal opportunity to appear in the sample. It may be noted
incidentally that not only do the schools have an equal opportunity to
appear in the norms sample, but also, in the case of simple cluster
sampling and in the case of two-stage sampling with size of subsample
proportional to size of school, each student in the norms population has
an equal probability of being included in the norms sample. Under the
procedure to be considered in the present subsection, the sampling of
schools is so carried out that the probability of any school appearing in
the norms sample is proportional to the size of the school; once the
sample of schools has been selected in this way, the subsampling process
is carried out as in the previous section, m students being selected at
random from each school. The method of the immediately preceding
subsection will be referred to as the method of weighted averages; the
method of the present subsection will be referred to as the method of
weighted sampling.

In the present method, weights are given to the schools during the
sampling process; hence an obvious sample statistic to use to estimate
the mean score of the norms population is the unweighted sample mean

$$\bar{y}_r = \frac{1}{n} \sum \bar{y}_i. \qquad (19)$$

When used with the method of weighted sampling, $\bar{y}_r$ gives an
unbiased estimate of the population mean score.
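The unbiasedness of the unweighted mean under sampling with probability proportional to size can be verified exactly for a small made-up population. In the sketch below the schools, their sizes, and their mean scores are hypothetical; the expectation is taken over a single draw, which suffices because the sample mean of n independent draws has the same expectation, and each subsample mean is itself unbiased for its school mean.

```python
from fractions import Fraction

# (school size M_i, school mean score Y_i): hypothetical population
schools = [(100, 40), (200, 50), (700, 60)]

total_students = sum(size for size, _ in schools)
population_mean = Fraction(
    sum(size * mean for size, mean in schools), total_students
)

# Draw probabilities proportional to school size.
pps_probs = [Fraction(size, total_students) for size, _ in schools]

# Expected school mean under pps selection and under equal-chance selection.
expected_pps = sum(p * mean for p, (_, mean) in zip(pps_probs, schools))
expected_equal = Fraction(sum(mean for _, mean in schools), len(schools))
```

Under pps selection the expected value equals the population mean of 56, whereas selecting schools with equal chances and then averaging without weights would give 50.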
It remains to determine the sampling variance of this statistic. As a
first step, it is helpful to transform the problem into one involving
only simple random sampling. This is achieved by noting that the effect
of applying the present pps (probability proportional to size) sampling
method to a given population is effectively the same as if an ordinary
random sampling process had been applied to a population differing from
the given population in specified ways. Specifically, if schools of size
$M_i$ occur with frequency $f_i$ in a given population, a pps sample of
schools drawn from that population is effectively the same as if it had
been drawn by simple random sampling from a modified population in which
schools of size $M_i$ occur with frequency proportional to $M_i f_i$.
Thus, to obtain the sampling variance of $\bar{y}_r$, it is only
necessary to apply the equation already obtained, using population means
and variances from the modified population rather than from the actual
norms population.
It is quickly found that 


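where the "stars" indicate that the population statistics are computed
from the modified population described in the preceding paragraph.
Formulas for $A^*(S_i^2)$ and $S^{*2}(Y_i)$ in terms of the unmodified
population values are:

$$A^*(S_i^2) = \frac{\sum M_i S_i^2}{\sum M_i} = \frac{1}{N\bar{M}} \sum_{i=1}^{N} M_i S_i^2, \qquad (21)$$

$$S^{*2}(Y_i) = \frac{1}{N\bar{M}} \sum_{i=1}^{N} M_i Y_i^2 - \left(\frac{1}{N\bar{M}} \sum_{i=1}^{N} M_i Y_i\right)^2. \qquad (22)$$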


does the present method ( 


ed averages, as given in equ 
ly greater than that of th 
presént method; if the 
small, (18) is greater 
ly (1 + Cp). 


ch is based on college- 
freshmen data with n= 100, M = 
S?(Yj) = 16, A(St) = i 


ast under 
the table, 
Í exami- 


Although the foregoing provides a good basis for preferring the present
method to the other methods, the situation is far from clear cut. The
modified distribution of schools usually differs very drastically from
the actual distribution of schools. The values of $A^*(S_i^2)$ and of
$S^{*2}(Y_i)$ for the modified distribution are determined primarily by
the large schools; if the large schools tend to have higher within-school
variances or tend to differ more from each other in their achievement
than do the smaller schools, the sampling variance of (20) may become
larger than that of (13) or (18). The computational weights and sampling
probabilities that will together produce sample estimates with minimum
standard errors have been determined theoretically, but these theoretical
results cannot be applied in practice without detailed a priori knowledge
of the population to be sampled.
There is a practical difficulty that may arise during the sampling
process in the method of weighted sampling whenever the schools in the
population differ widely in size. Suppose, as may well be the case, that
some schools are 100 times as large as are other schools that occur
frequently in the population. A pps sample should then contain about 100
times as many of the former as of the latter schools. The total norms
population, however, might contain only one or two of the larger schools.

In such a case, pps sampling is possible only if the first-stage sampling
is done with replacement. This procedure is quite satisfactory
theoretically, but it is not always administratively possible to test a
large number of students in a single school, as has been pointed out
previously.
When the first-stage sampling is done with replacement, so that a single
school may be drawn several times, a new subsample must be taken from the
school each time the school is drawn. Equation 20 will still apply,
providing second and subsequent subsamples are allowed to overlap the
first, so that the same student may appear in more than one subsample. It
would be more efficient, however, to carry out the subsampling without
duplication of students, in which case the sampling error would be less
than that given by (20).
It is obviously possible to make a compromise between the method of
weighted sampling and the method of weighted averages. The weighting can
be done during the sampling process insofar as is convenient. Further
weighting can then be effected computationally by applying appropriate
numerical weights to the sample data collected. Sampling variances


ex- 

ol 
mate of the size of each scho 
estimates may be used vedure 
The effect of this proce can 
or of the resulting aus an Я 
by formulas given in refere 


School-Mean Norms


Most test publishers provide no appropriate norms for the use of those
administrators who are interested in the mean score of a school rather
than in the score of an individual. Norms appropriate for this purpose
should represent the frequency distribution of school means.
The two-stage sampling procedures recommended in the preceding sections
have the disadvantage that the data obtained do not provide the exact
mean score for any school, but only for a sample of the students in that
school. This disadvantage, at least in theory, is more than compensated
for by the fact that many more schools are represented in the sample data
obtained. Some convenient method is needed, however, for reconstructing
the school-mean norms from the sample data.

The actual variance of the n sample means, $s^2(\bar{y}_i)$, is a sample
estimate of the square of the corresponding standard error,
S.E.$(\bar{y}_i)$. This standard error is made up of two independent
sources of fluctuations (cf. eq. 6):

$$\mathrm{S.E.}^2(\bar{y}_i) = S^2(Y_i) + \frac{1}{N} \sum_{i=1}^{N} \mathrm{S.E.}_i^2(\bar{y}_i), \qquad (23)$$

where $\mathrm{S.E.}_i^2(\bar{y}_i)$, the sampling variance for the mean
of the subsample drawn from school i, is the usual variance in sampling
from a finite population:

$$\mathrm{S.E.}_i^2(\bar{y}_i) = \frac{M_i - m_i}{m_i M_i}\, S_i^2. \qquad (24)$$

A sample estimate of $S^2(Y_i)$ is obtained from (23) and (24) by
substituting sample estimates for the other two variances:

$$\hat{S}^2(Y_i) = s^2(\bar{y}_i) - \frac{1}{n} \sum \frac{M_i - m_i}{m_i M_i}\, s_i^2. \qquad (25)$$
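The variance-components estimate can be sketched in a few lines. The example below is illustrative only: it takes equation 25 to be the variance of the subsample means minus the average of the finite-population sampling variances (M_i - m_i) s_i^2 / (m_i M_i), with both variances computed with divisors n - 1 and m_i - 1; the function name and the data are assumptions introduced here.

```python
from statistics import mean, variance


def estimate_school_mean_variance(subsamples):
    """Estimate the variance of true school means (equation 25 as assumed above).

    subsamples -- list of (school size M_i, list of subsample scores)
    """
    subsample_means = [mean(scores) for _, scores in subsamples]
    s2_means = variance(subsample_means)          # divisor n - 1

    corrections = []
    for size, scores in subsamples:
        m_i = len(scores)
        s2_i = variance(scores)                   # divisor m_i - 1
        corrections.append((size - m_i) / (m_i * size) * s2_i)

    # Remove the part of the observed variance that is due to subsampling.
    return s2_means - mean(corrections)


# Three schools of 100 students, two examinees each (made-up scores).
subsamples = [(100, [48, 52]), (100, [58, 62]), (100, [53, 57])]
estimate = estimate_school_mean_variance(subsamples)
```

Here the subsample means have variance 25, of which 3.92 is attributed to within-school sampling, leaving an estimate of 21.08 for the variance of the school means. Two examinees per school is the minimum for which the within-school variance can be computed.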


Equation 25 is an obvious extension of the rationale and methods of
variance-components analysis. In general, equation 25 should provide a
better estimate of the true variance of the school-mean distribution than
would have been obtained by the usual methods, since these use fewer
schools.

If the estimated variance $\hat{S}^2(Y_i)$ does not differ too greatly
from the sample value $s^2(\bar{y}_i)$, it can then be assumed that the
shape of the frequency distribution of school means would not differ too
much from the shape of the frequency distribution of subsample means
except for a change in scale corresponding to the change in variance. If
necessary, the logic by which equation 25 was developed could be extended
to provide estimates of higher moments of the frequency distribution of
school means, and the shape of this frequency distribution could be
determined from the estimated moments. If only the variance of the
school-mean distribution is to be estimated, no more than two students
need be tested in each school; larger samples from each school would be
necessary if the higher moments were required.


Stratified Sampling 


In stratified sampling, each stratum is represented in the sample by a
predetermined number of cases. This distinguishes it from cluster
sampling, since the presence or absence of any cluster in the sample is a
matter of chance.

Up to this point, no mention has been made of 
the unquestionable desirability of stratifying the 
schools in the norms population before the sample 
of schools is drawn. Any of the methods of sam- 
pling already discussed can then be applied sepa- 
rately to each of the resulting strata. 

In the case of a nationwide norms population, it will almost surely be
important to set up two or more strata based upon geographical location.
Additional strata at the college level might be (i) junior colleges and
senior colleges; (ii) male, female, and coeducational colleges; (iii)
liberal arts colleges, teachers colleges, etc.; (iv) public, private, and
church institutions. Ideally all of these dimensions should be taken into
account simultaneously; for example, one stratum should consist of
southern junior coeducational private liberal arts colleges.

The foregoing are plausible dimensions for use as bases for
stratification because each seems likely to be related to test score. On
the other hand, school size may be a useful basis for stratification even
when it is totally unrelated to test score. This last is suggested by the
fact that the standard errors represented by formulas 2, 13, and 18 would
be reduced if $C_M$, the coefficient of variation of school size, were
equal to zero.

If schools were perfectly stratified on size, it would make no difference
which of the last three sampling methods discussed is applied within each
stratum, since $\bar{y}$, $\bar{y}_r$, and $\bar{y}_w$ are identical
whenever all schools are the same size ($y'$ is a special case occurring
when the subsampling fraction is 1). Furthermore, the sample mean score
within stratum h ($\bar{y}_h$, say) would be an unbiased estimate of the
population mean score for stratum h. An unbiased estimate of $\bar{Y}$,
the general population mean, would be given by the sample statistic

$$\hat{Y} = \frac{\sum_h M_h N_h \bar{y}_h}{\sum_h M_h N_h}, \qquad (26)$$

where $M_h$ is the size of each school in stratum h and $N_h$ is the
number of schools in stratum h. It is readily seen that

$$\mathrm{Var}\,\hat{Y} = \frac{1}{(\sum_h M_h N_h)^2} \sum_h M_h^2 N_h^2\, \mathrm{Var}_h\,\bar{y}_h. \qquad (27)$$
h 


If only a small proportion ($g_h$) of the schools in stratum h are used
in the sample, then $\mathrm{Var}_h\,\bar{y}_h$ can be obtained from
(13), (18), or (20) by setting all $M_i$ equal and thus setting
$C_M = 0$, the result being

$$\mathrm{Var}_h\,\bar{y}_h = \frac{1 - f_h}{n_h f_h M_h}\, A_h(S_i^2) + \frac{1}{n_h}\, S_h^2(Y_i), \qquad (28)$$

where $n_h$, $f_h$, $A_h$, and $S_h$ are the values of n, f, A, and S for
stratum h. If $g_h$ is not small, the appropriate modification of (28) is

$$\mathrm{Var}_h\,\bar{y}_h = \frac{1 - f_h}{n_h f_h M_h}\, A_h(S_i^2) + \frac{1 - g_h}{n_h}\, S_h^2(Y_i). \qquad (29)$$
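The stratified estimate and its sampling variance can be sketched as follows. The example is illustrative only: it assumes equation 26 weights each stratum mean by M_h N_h, equation 27 combines the stratum variances with squared weights, and equation 29 has the form (1 - f_h) A_h / (n_h f_h M_h) + (1 - g_h) S_h^2 / n_h. All names and numbers are hypothetical.

```python
def stratum_variance(n_h, f_h, size_h, a_h, s2_h, g_h=0.0):
    """Assumed form of equation 29; g_h = 0 gives equation 28."""
    within = (1 - f_h) / (n_h * f_h * size_h) * a_h
    between = (1 - g_h) / n_h * s2_h
    return within + between


def stratified_mean(strata):
    """Assumed form of equation 26; strata holds (M_h, N_h, ybar_h, var_h)."""
    total = sum(m * n for m, n, _, _ in strata)
    return sum(m * n * ybar for m, n, ybar, _ in strata) / total


def stratified_variance(strata):
    """Assumed form of equation 27."""
    total = sum(m * n for m, n, _, _ in strata)
    return sum((m * n) ** 2 * var for m, n, _, var in strata) / total ** 2


# One stratum worked through equation 29 (made-up values):
# 10 of 80 schools sampled, 10 percent of the 50 students in each.
v_small = stratum_variance(n_h=10, f_h=0.1, size_h=50, a_h=80.0,
                           s2_h=16.0, g_h=10 / 80)

# Two strata: 80 schools of size 50 and 20 schools of size 200.
strata = [(50, 80, 48.0, 0.4), (200, 20, 54.0, 0.9)]
estimate = stratified_mean(strata)
variance_of_estimate = stratified_variance(strata)
```

Both strata here contain 4,000 students, so the overall estimate is the simple average of the two stratum means and its variance is one quarter of the sum of the stratum variances.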


The standard errors given by (28) and (29) are generally less than those
given by (13), (18), or (20) because now $C_M = 0$. The standard error of
the estimate obtained in stratified sampling is seen by (27) to be a
weighted average of the standard errors relating to the separate strata,
as given by (28) and (29),
Standard error of 


It will usually be the case that there are several dimensions, such as
geographical region and type of support (public or private), on which the
population should be stratified, because these dimensions are correlated
with test score.


Optimum Sampling Procedures 
Formulas are availapi 


population. In addition, t 
tailed information (or estimates) 
ic costs characterizin, 


require de- 
as to the econom - 


dents tested. 


The possibility and theoretical desirability of multistage sampling
procedures will also have suggested itself to the reader. The school
system, the single school, the single class within a school, all
represent possible stages in a multistage sampling procedure.

The interested reader is referred to standard texts on sampling methods
for further details. The methods discussed in previous sections of the
present article will serve admirably, however, whenever the detailed
information necessary to determine optimum procedures is lacking, as will
frequently be the case.


Summary of Notation 
пагу ot Notation 


$A(S_i^2)=\sum S_i^2/N$      Arithmetic mean of $S_i^2$ for norms population.
$a(s_i^2)=\sum s_i^2/n$      Arithmetic mean of $s_i^2$ for sample.
$a(M_i)=(\sum M_i)/n$        Average size of the schools in the sample.
$C_M=S_M/\bar{M}$            Coefficient of variation of school size in norms population.
$c_M=s(M_i)/a(M_i)$          Sample coefficient of variation for size of school.
$f=m_i/M_i$                  Proportion of students tested in each school when proportionate subsampling is used.
$m$                          Value of $m_i$ when the same size of sample is drawn within each school.
$\bar{M}$                    Mean size of school in norms population.
$\bar{m}$                    Average size of within-school sample.
$M_i$                        "Size of school i": the number of students in the ith first-stage sampling unit.
$m_i$                        Size of sample within school i.
$N$                          Number of schools in norms population.
$n$                          Number of schools in norms sample.
                             Correlation between school mean score and school size in norms population.
$S_i$                        Standard deviation of students' scores in school i multiplied by $\sqrt{M_i/(M_i-1)}$.
$s_i$                        Standard deviation of students' scores in subsample from school i multiplied by $\sqrt{m_i/(m_i-1)}$.
$S_M$                        Standard deviation of school size in norms population.
                             Standard deviation of individual scores in norms population.


LORD 268 


$S(Y_i)$                     Standard deviation of school mean score in norms population.
$s(y)$                       Observed standard deviation of subsample means multiplied by $\sqrt{n/(n-1)}$.
$\bar{Y}$                    Mean score in norms population.
                             Mean score of norms sample obtained by two-stage sampling with proportionate subsampling (5).
$\bar{Y}_i$                  Mean score of school i.
$\bar{y}_i$                  Mean score in sample from school i.
                             Mean score in norms sample obtained by simple cluster sampling (1).
                             Unweighted mean of norms sample when size of subsample is fixed.
$\sum M_i\bar{y}_i/\sum M_i$ A weighted mean score of norms sample when size of subsample is fixed.
                             Unbiased estimate of the mean of the norms population in simple cluster sampling (3).
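
The coefficient of variation of school size, defined in the notation as the standard deviation of school size divided by the mean size, can be illustrated with a brief sketch. The function name and the school sizes are hypothetical, and the use of the population form of the standard deviation (division by the number of schools) is an assumption, since the summary does not spell out the divisor.

```python
from statistics import mean, pstdev


def coefficient_of_variation(school_sizes):
    """Standard deviation of school size divided by mean school size.

    Uses the population standard deviation (an assumption of this
    sketch); school_sizes is a list of enrollments, one per school.
    """
    return pstdev(school_sizes) / mean(school_sizes)


# Hypothetical enrollments for three schools.
print(coefficient_of_variation([100, 200, 300]))

# When every school is the same size the coefficient is zero, which is
# the simplification used when all school sizes are set equal.
print(coefficient_of_variation([250, 250, 250]))
```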


Note:
One of the purposes of the present article has been to obtain standard error formulas expressed in terms of quantities whose magnitudes can be estimated in advance of actual testing. Whenever actual norms data are available, the formulas in standard texts will usually be preferable, since these usually require fewer assumptions.
THE ROLE OF SELF-CONCEPT
IN ACHIEVEMENT


ROBERT M. ROTH 
Hampton Institute 
Hampton, Virginia 


THE PURPOSE of this study was to investigate the relationship between self-concept and achievement. For reasons of practicability, the vehicle for achievement in this study was reading improvement on a college level. Reading improvement as a subject for scientific investigation has been replete with much speculation. Most of the literature in this area tends to report research in terms of the cause of the reading difficulty and the techniques involved in improvement (1, 6, 17). Some work has been done in attempting to create theoretical frameworks with which to understand reading improvement (10, 14). These theories, however, have not been tested successfully in an experimental setting.

The theory tested in this study has its orientation in the theory of self most succinctly integrated by Rogers (13), and can be characterized by the statement by Lecky (7:153):


    Any value entering this system (of organization) which is inconsistent with the individual's valuation of himself cannot be assimilated; it meets with resistance and is likely, unless a general reorganization occurs, to be rejected.


Theory and Propositions
а situ- 


Ation yp. Stud | 
Which p м, subject finds himself in 
S sures him to change (a reading 


inc. O Ve 
Cre "ment prose 5 
gram іп which reading 


а: 
Conta ^ Sn films 0 


a gs 
ins comp arepresented in each successive 
` This e of change requiredis one from 
al Бы ition produces a force upon 
How ope Which he is expected to do some- 
he n he is to this experience will de- 
E np, P Pa VAR this demand. 
“pt, is emand as a threat, h€ defends 
ç threat 3 uc ЖЕ" n- 
ani Chan, P do and maintains hh self co 
im inge Cien not see this as a threat he 
p 1 Self concept commens ate with 
e former 


e 
ае rs 


udi 

ар Toad 

exta c pes experience. ped 

Der егеу Less a P6 жараса to result in insig 
ig, ећ change than the latter. gos 


Op Ence , Against thi 

Ne may” Such a this threat by distorting UE 

bh» da Ta integrate very little 
the experience by leaving the situation. The proposition which was tested in this study was that there will be significant differences among the self perceptions of the three groups in their general defensiveness, Self as a Self, Self in Relation to Authority, Self as a Student, and Self as a Reader conception such that the three groups will appear in the following order from most defensive to least defensive: Attrition, Non-improver, and Improver.


Method 


The Sample
The subjects were drawn from three reading improvement classes at the University of Texas. These classes were not offered for credit and were voluntary on the part of the student. The only requirement was that the student read initially at the minimum rate of 250 words per minute and have a comprehension of 75 percent on the DRT. This approach tended to keep the group more homogeneous in terms of initial reading achievement. Other kinds of classes were provided for those who did not meet these requirements. The groups which were studied were those who did pass the initial requirements. These classes met for a total of 14 one-hour sessions, during which increasingly speeded reading films were shown. These were followed by reading selections from a reading manual. After reading the film the students answered questions based on the contents of the film. This was done after they had read the selection in the manual. Both film and manual were arranged in series so that the student was required to read faster each time. The student kept a record of the speed of the film and the comprehension achieved, and the speed and comprehension of the reading in the manual. Time was allowed each hour for group discussion of the students' reading habits.

The Groups 
— roup consisted of 50 to 60 per 
althoug! th Shely in age, academic status, 
class ran ^ Freshman males and fe- 
gcholast ie pecause of the greater probabil- 
males 
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ity. A sample of 54 was used, con- 
x 7 зл апа 45 male subjects. е 
The subjects' scores on the Diagnostic Reading Test before and after the program were converted into equivalent scores with use of the Table of Equivalent Scores (11), which weights speed and comprehension when considered together. The weightings are such that comprehension is more effective than speed


but the comprehension ge 
for the higher weights. 

test were then conv 
posttest. A correl 


which was not sign 
Inasmuch as the s 


The Attrition group contained those subjects who started the program but who discontinued before the seventh
ау point. 


The Procedure

1. The class met four times before the work on reading improvement began.
   A. During the first meeting the DRT was administered and an orientation to the course was given.
   B. During the second meeting the DRT test results were interpreted to the group and the SCT administered.
   C. During the third meeting the Self Sort was administered.
   D. During the fourth meeting the Ideal Sort was administered.
   E. Arrangements were made for those students who did not take the measures described above at the prescribed time.

2. The next

3. When the 14 meetings had terminated four more meetings were scheduled and procedures 1A to 1E were repeated.


The Instruments

The Q Sort. The Q methodology employed in the present research involved a structured sample which consisted of 80 self reference statements selected from a larger universe of statements which can be sorted "Self I am" and "Ideal self." The "Self I am" is defined as the self as perceived by the subjects at the time of sorting and the "Ideal self" as the self the subject would like to be. Originally, the Q statements stem from a number of sources (13, 10), and new items constructed by the experimenter and his colleagues.

The experimenter chose 80 items provisionally sorted as to dimension and orientation by psychologists. There were two orientations, intensional and extensional; and four dimensions: Self as a Self, Self in Relation to Authority, Self as a Student, and Self as a Reader. These items were presented to judges who determined the placement of each statement. Classification of the statements was derived from the judges' opinions after the placement of the items. Eight items were agreed upon as to placement at the .02 level of confidence. Two items were rewritten.
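
Since each subject sorts the same statements twice, once as "Self I am" and once as "Ideal self," the two sortings can be compared item by item. The sketch below shows two such comparisons: a product-moment correlation between the two sets of Q-values, and a mean absolute difference. The function names, the choice of mean absolute difference as a discrepancy index, and the five-item data are illustrative assumptions; the study's own discrepancy computation is not specified in the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mean_abs_discrepancy(self_sort, ideal_sort):
    """Average absolute gap between Self and Ideal values per item."""
    gaps = [abs(s - i) for s, i in zip(self_sort, ideal_sort)]
    return sum(gaps) / len(gaps)


# Hypothetical Q-values (1 = "Least like me", 9 = "Most like me")
# for five statements sorted under the two instructions.
self_sort = [2, 4, 5, 6, 8]
ideal_sort = [3, 3, 5, 7, 9]

print(pearson_r(self_sort, ideal_sort))
print(mean_abs_discrepancy(self_sort, ideal_sort))
```

A high correlation with a small mean gap would indicate close agreement between the perceived and the ideal self; how such agreement bears on defensiveness is a matter for the study's own interpretation.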


The sample was structured according to the balanced design shown in Table I. The subjects were asked to sort the 80 self reference statements with the following number of items in each stack assigned a value on the continuum from 1, "Least like me," to 9, "Most like me." Quasi-normal distributions of values of Q items


25,67 
x= , 
were designed such that x 400, x?- 2382, 
2000 


Sentence Completion Technique. This technique was developed from the statements in the Q sort. These statements were selected from each of the four dimensions of the sort. They were converted into stems to be completed.


ена 
і і nitio o 
responses including defi o tw 


I 
Table P 


sulte 
contains rank ord 


+ men 
n erim e 
from Comparisons of Scoring by the exp th 


? mong 
and the two judges. The correlations а dthet 
experimenters Scori i 


judges Scoring the s 
Cant at the | 05 level 
tions between the twi 
Significance 


m 
co” 
. A F were ert 
Factors—The s tuden de ina 
pared on a measure or academic sene Exam. 
can ation Psychologic t measur 
tion ng achievemen 




TABLE I


DESIGN OF THE Q-SAMPLE


                          Areas of Self Reference
Orientation           Self   Authority   Student   Reader   Total
Intensional Items      10       10         10        10       40
Extensional Items      10       10         10        10       40
Total                  20       20         20        20       80


TABLE II 


INTERCORRELATIONS AMONG SCT TOTAL SCORES ASSIGNED
BY JUDGES AND THE EXPERIMENTER


beer Aa m 
емони а = 


‚10* .64* 


Experimenter 
.55 


Judge 1 one 
*Significant at the .05 level of confidence
Cooperative English Test, vocabulary scores of the Diagnostic Reading Test, grade-point averages before and after the program, and a measure of reading effectiveness (Equivalent Scores) derived from DRT rate and comprehension measure. Equivalent Scores before and after the program were used for categorizing Improvers and Non-improvers.


Results 
Inasmuch as the data in a study such as this 
ts will be present- 


1. Self as a Self— The Attrition and Non-im- 


redto be more concerned 
he Improvers, The Attrition 


er, and Improver, 

2. Self in Relation to Authority—In genera] 
there appeared to be no real difference among the 
groups in this area, however, may 
have meaning. Although there were simj 


while the Non-improvers showed 
fication with them (Tables III and IV) 

The groups were 
Ideal discrepancies 


y for the intensional items, while those of the Non-improver were most discrepant. The groups did not differ much with respect to discrepancies resulting from extensional items (Table V).


There was no clear pattern of defensiveness in 


3. Self as a Student. The Improvers seemed to be most concerned with this area. They were the most intensional and the most extensional. The differences among groups within the extensional orientations were negligible. The Attrition was the next most concerned with the intensional items. The Non-improvers were the least concerned with the intensional items (Tables III and IV).

The Self-Ideal discrepancies were the same within the area for the Attrition and the Improver groups. They both had large discrepancies with respect to the intensional items. Discrepancies resulting from the responses with respect to the extensional statements differed only slightly from group to group. The Non-improvers were the least defensive in this area. Both the Attrition and the Improvers were more defensive than the Non-improvers (Table V).

In terms of defensiveness the three groups appeared to be in the following order from most defensive to least defensive: Improvers, Attritions, and Non-improvers.

4. Self as a Reader. The Improvers were most concerned with this area, the Non-improvers next, and the Attrition group of least concern. The groups followed the same order in identification with the intensional items, while identification with the extensional items was greatest with the Non-improvers and least with the Attrition group (Tables III and IV).

Although the total Self-Ideal discrepancies for the area were not significant for all three groups, the Non-improvers showed the greatest discrepancies with both the intensional and extensional items when orientations were separately considered (Table V).

The Improvers were less defensive than ; пе 
Non-improvers. The Improvers high intension 


: e 
eted as an expression of its larg 
concern with the a 


all reader items 
ately. The Non. 
ally to the inten 
the items, This 
in the Self-Ide 


Я ГОУ Т 
аѕі defensive: Attrition, Non-impr 
er, anda Improver. 


5. General Defensiveness. General overall defensiveness was measured by three different measures: discrepancy derived from Self and Ideal sorting, Self-Ideal correlations, and the SCT. From all three, the results indicated distinct differences in defensiveness. The Attrition group showed more general defensiveness than either of the two
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1 
x TABLE III 


P ANA 
| LYSIS OF VARIANCE IN Q-VALUES FOR 36 SUBJECTS ON SELF SORTS BE - 
FORE INITIATION OF PROGRAM 


SEMEN iL LLL E 


Source of 
Sum of Mean 
Square F P 


df Squares 


Independent Variations 


i 
| Бы 
ples (S 
Persons nd 2 nil 
mples (SP) 
Sum 33 nil 
м _ NN s m, 
о orientations (O) 
Tientati 
Interaction (o) А 118.00 118.00 74.14 ‚01 (0/OSP) 
eviation (09) 2 25.19 12. 60 .30 ie 
n (OSP) 33 319. 51 9. 68 2.29 101 (OSP/R) 
Sum " 
35 1062. 70 
| ; Dimensions (D) 
| Inpe eñSion 76.38 LA 
s Я ç and 
DU. o mno Xo ОШ Ue 
ton ч 5.46 1. . 
Sum (Dsp) 99 540. 61 
108 1028. 04 
orientations (OD) 
Pract; 1.67 
i 1 А de 
on ( 91.12 n н 3.72 .01 (ODS/ODSP) 
. 1.16 nm 
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TABLE IV 


SUMMARY OF MEAN Q-VALUES FOR DIMENSIONS, ORIENTATIONS, AND 
SAMPLES FOR SELF SORTS BEFORE INITIATION OF PROGRAM 


Groups 


Self Authority Student Reader 


Dimension and Sample 


Improvers 5.21 4.57 5.10 5.13 
Non-improvers 5. 50 4. 64 4.81 5.05 
Attritions 5. 50 4. 61 4. 96 4. 92 
Total Means 5.40 4.61 4.96 5.03 
Intensional Orientation, Dimension, Sample 
Improvers 4.34 4.01 4.98 4.61 
Non-improvers 4.77 4.00 4.48 4.34 
Attritions 5.07 4.21 4.82 4.40 
Total Means 4.73 4.07 4.76 4. 45 
Extensional Orientation, Dimension, Sample 
Improvers 6.07 
Non-improvers 6.23 ч ^ 3 ^ Ж 
Attritions 5.92 5.01 5.03 3 P 
Total Means 


6.07 
TABLE VI


MEAN SELF-IDEAL CORRELATIONS FOR THREE GROUPS ON PRE-TEST


           Improvers   Non-improvers   Attritions
Means         .56           .55           .29


TABLE VII


t-TESTS OF SIGNIFICANCE OF MEAN SELF-IDEAL CORRELATIONS
AMONG THE THREE GROUPS


Groups             Non-improvers   Attrition
Improvers              0.07          2.11*
Non-improvers                        2.04*

*Significant at the .05 level of confidence


Ly SELF-IDEAI, CORREL 
MONG THE THREE GROUPS MUTA 


Groups 
Groups Non-j mprovers Attrition 
Improvers 0.00 


3.24* 
Non-improvers 


3.24* 
*Significant at the .10 level of Confidence 
TABLE IX


t-TESTS OF SIGNIFICANCE AMONG THE MEAN SCT SCORES
OF THE THREE GROUPS


Groups 
Groups Non-improvers Attrition 
Improvers 2.12 4.16* 


Non-improvers p. 28% 


*Significant at the .01 level of confidence
**Significant at the .05 level of confidence


TABLE X 
CHI SQUARE ANALYSIS AMONG THE MEAN SCT SCORES 
OF THE THREE GROUPS 
Groups 
Groups Non-improvers Attrition 
Improvers 1.16 7, 28* 
2.14% 


Non-improvers 


*Significant at the .01 level of confidence
**Significant at the .05 level of confidence


TABLE XI 


MEAN SELF-IDEAL CORRELATIONS BEFORE AND AFTER
THE PROGRAM
56 .25 
Improvers à 
«B4. 


Non-improvers .85 




274 JOURNAL OF EXPERIMENTAL EDUCATION 


TABLE XII 


CHANGES IN SELF-IDEAL DISCREPANCIES IN MEAN Q-VALUES
FROM PRE- TO POST-SORTINGS FOR THE THREE GROUPS


Dimensions — 
Improvers 


Non-Improvers 
Self Authority Student R 


== . ..TOn-mprovens — 8 
eader Бе Authority Student Reader 

Intensional 

Items 0. 09* 0.32 


0-88 dilo б ән 0.18 -0.24 
Extensional 


Items 


-0. 06 0.71 
Total Change 0.03 


-0.04 -0.10 -0.32 


1.03 0.89 0.16 


0.08  -0.56 


TABLE XIII

MEAN CORRELATIONS OF PRE- AND POST-SORTS

                   Self    Ideal
Improvers          .58*    .76*
Non-improvers      .64*    .84*

*Significant at the .01 level of confidence
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ANALYSIS OF VARIANCES IN Q-VALUES FOR MA 


Source of 
Variation 


df Squares Square F P 


Samples (s) 
Persons in Samples (SP) 


Sum 


acc MN 


Orientation (O) 
Interaction (OS) 
Deviation (OSP) 


Sum 


Dimension (D) 
Interaction (DS) 
Deviation (DSP) 


Sum 


Interaction (OD) 
(ODS) 
Deviation (ODSP) 


Sum 


Total Variation 
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TABLE XVII 


SORT BEFORE INITIATION OF PROGRAM 


LE-FEMALE ON SELF 


Sum of Mean 


Independent Variation 


1 nil 
1 nil 


Orientation (O) 


44.97 14.99 
3 8.83 2.94 RM 
48 246. 30 5.13 
54 300. 10 


Dimension and Orientation (OD) 


38. 69 12.90 
14.95 4. 98 
48 251.36 5. 24 


6876 


.01 (0/05Р) 


01 (OSP/R) 


94 305.00 
Replication (R) 1296 5850.15 
groups. The Non-improver though not exhibiting as much defensiveness in this respect as the Attrition sample, did show more than the Improvers (Tables V, VI, VII, VIII, IX, X).

Stability of the Sort. The correlations of the Self sorts before and after the program (Table XIII), and the Ideal sorts before and after the program (Table XIII), indicated a consistency well above that expected by chance. Self-Ideal correlations before and after the program (Table XI), however, suggested that this consistency did not apply to the Improvers but it did to the Non-improvers. The former indicated a considerable drop in correlations while the latter showed little change. The Improvers further showed an increased discrepancy in the areas in which they were most defensive: Self in Relation to Authority and Self as a Student. The Non-improvers although indicating little change did alter in the direction of a decrease in defensiveness in the areas where they had been most defensive: Self as a Self and Self as a Reader (Table XII).

Validity of the Sort Sample. The analysis of the SCT yielded relationships among the three groups studied which were very similar to those found in the analyses of the Self and Ideal sorts (Tables IX and X). This cross validation supports the probability that both instruments were measuring what they were designed to measure.

Academic Differences. An analysis of the subjects' performance on various aptitude and achievement measures indicated that the Non-improvers scored much higher than the other two groups (Tables XIV and XV).
rov ssi es r T 
ed each ers and the Attrition cases TU 
ading ef- 


ее 
бате Ee of the program this re 
me the a as the Improvers improve? rove 
but log, 0P -improvers not only failed be nde 


t 
Ing skin 70 und in terms of the measu 


ed that although 


hi 

t e : 

he Impp, ade-point average indicat 
er scores than 


the N Provers had signifi low 
Nasunog rovers neers Š and achievement 
The At es they had a high grade-point average. 
4800 “ition group had the lowest achievement 
of Ppeq = the end of the program, the Improve, 
a the Айу а grade-point average similar to A 
09 the am Sample, but the Моп-іарготе е 
ade tritiongroupincreased slightly in ter 
ped average. de of 
fo, “Cade ifferences— An analysis was ma ao 
manco hie factors (Table XVI), and {ће іе 
еш the Self and Ideal sorts of the m 
есес © (Table XVII. No significa? 
Were found. 
Discussion


The purpose of this study was to investigate the relationship between self concept and achievement as demonstrated by Freshmen who came voluntarily to a college reading improvement program. Although the differential effects of the various self-concepts were thought by the experimenter and his colleagues to be operant in academic learning situations of all kinds, the reading improvement program was chosen because of proximity and availability. Inasmuch as a program such as this one "pressures" an individual to change his patterns of reading, he would be expected to do something about this pressure. He could be "open" to the situation and change his reading habits in relation to the demands of the situation, or on the other hand, he could avoid meeting the demand to change by either distorting the situation by his perceptions of it, or by denying it entirely. According to Rogerian self theory, he would deny or distort the experience as a defense against an inconsistency with self concept. He would consider it more important to maintain a conception of self than to integrate experiences which might necessitate changing the concept. This condition occurs when the self concept is used as a defense against threat.

The theory, thus proposed, is that with all other factors equal, those who did something constructively from the experience would demonstrate less defensiveness in their concept of self as a reader than those who did not do as well. If an individual's concept of self as a reader were defensive, it would be an expression of a more permeating defense system which would be manifested in self concepts other than that of reading. Three other areas were therefore studied.

Defensiveness, in this investigation, refers to


ions which are distortions in terms of 
seii e жү absolute, unconditional, unlimit- 
wer eralized. Defensiveness is an at- 
ceptof self. The anticipa- 
to this tends to make 
nt in his conception and 


Findings 
ae i sted by this study was that 
= proposition icant differences among the 
tions of the Improver, the Non-improv- 
This basic proposi- 
t of the five 


er, xi 
i found suppo 
tion has jtions an 


en obtained which supported the prop - 


Proposition 1: There will be significant differences in general defensiveness such that the three groups will appear in the following order from most defensive to least defensive: Attrition, Non-improver, and Improver.

The data from three different measures tended to support this proposition: the Self and Ideal sorts, the Self-Ideal correlations and the Sentence Completion Technique. The three groups did appear in the order predicted.

Proposition 2: There will be Significant differ- 
ences in the amount of defensiveness in the self 


in the following order fro 
least defensive: Attrition, 
prover. 


er, and Improver, 

The data from 
discrepancy meas 
This Proposition was th 
Study and it was Par ticularly meaningful that the 
defensiveness of this di i i 
the relative 
gram itself, 


will appear in the follow: 
fensive to least 


amount of d i 
Relation to Authori i 


her Study in 


ram 
who volunteer for a reading improvement prog 
and those who do not. 

Other findings were:


1. Stability of the Self-Ideal correlation from before the program to after it was found for the Non-improvers but not for Improvers. The latter group exhibited a decrease in Self-Ideal correlations. Upon further inspection this decrease seemed to be attributable to an increased defensiveness in those areas in which this group had first appeared to be defensive, namely, Self in Relation to Authority and Self as a Student. Although there was little change in the Non-improver group, almost all the changes were in the direction of a decrease in defensiveness, particularly in the areas of greatest defense, which were Self as a Reader and Self as a Self.

2. In almost all the aptitude and achievement measures used, the Non-improvers were significantly superior to the other two groups in terms of scores. Where there were no significant differences there was a strong tendency toward significance. This was true as well for the entering Equivalent Scores which measured reading effectiveness. At the end of the program the Non-improvers were lower than they had been when they had started.

3. In terms of the grade-point average, the Improvers had the highest for the semester previous to the one in which they had enrolled in the program. The grades declined during the semester in which the program was taken while those of the other groups increased slightly.


Re-evaluation of the Theory 


py thi 
For the most part, the theory proposed РУ ге 
study has held. Apparently there was à "= 
lationship between defensiveness in the се in the 
Сері as a reader and relative par tore” aie f 
reading improvement situation. There оле a 
SO to be a relationship between ере gener? 
the perception of self in this area, and t defensive 
efensiveness was expressed as relative had volu 
ress among a group of students V йүс fi^ 
teered for the course. How representa was dil 
Students were of the college population g for v 
cult to зау, The nature of руте oe a 
Class itself may have been a selective de nteer e 
Опе reason ог another those who did yoran nis 
an investment in their performance in elf. Tín 
Sram above and beyond the reading ра Self 
Phenomenon may have been related to ig di? ёе 
Relation to Authority conception. In ee gated is 
Sion the groups did not dier, butal Mior tad 
fensiveness, Although an explanation colle’ as 
not feasible on the basis of the data nes dye 
there was Some indication that selective ef ne ich 
volved. The Attrition group, the most nial W” 
TOUBhout, resorted to the defense of de 
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1 


он, against any change їп achievement, 
and наа. ог anything else. Тһе Improver 
some deti ss groups, however, presented 
differed ү phenomena. The Non-improver 
“apos rom the Improver inself concept in that 
Self as as of defense were Self as a Reader and 
were а omg The only changes they experienced 
cy to be v in reading effectiveness and a tenden- 
iàlly in Fs ittle less defensive throughout, espec- 
explains inr most defensive areas. Hogan (5) 
is remov ra о авва decreases when the threat 
the RE rom the individual. In effect then, 
VT etin. он entered the program in which 
of ER rip pmi e was a threat to the concept 
the thre te reader. In a sense, in this group, 
gram T defeated. They went to the pro- 
Ception i ully and were able to reify their con- 
ment huit. themselves. The situation for the mo- 
laxed, ended. The defense could then be re- 
Ta = е other hand, the Improvers' area of de- 
portant s Self as a student. For them it was im- 
heir Деге commit themselves to the reification of 
area тнр іп the concept of self in the student 
end of nlike the Non-improvers, however, the 
the program did not eliminate the threat. 


S Hogan claimed (5: 420): 


pense requires further defense since 
balle is not resolved by defense unless 
enge is ceased or the threat resolved. 


pu 
а идеп! area was not directly effected by 
thr ing experiences except to produce further 
ding effective- 


еа 
ness A That is with increased rea | 
idea s "dying should become more effective. This 
сері, A consistent with the defensive self con- 
Ness in nother defense or increasing defensive- 
8d this this area was required. The data support- 
their idea. Not only did the grades drop but 
Post sorts indicated an increased defensive- 
he areas of greatest defense, namely, 
ed Student and self in Relation to Authority. 
Only jg Саа in this study clearly indicated that no 
that, ека Concept related to ас hievement, but 
terms of their conception of self, individ- 


a 
do, With а definite investment to perform | 
achieve all things being equal, thos? who do n" 
Chie, Choose not to do so, while those who 


» Choose to do so. 


the 


elf as а 


Summary

This study attempted to test the proposition that there would be significant differences in the self perceptions of those who improved, did not improve, and dropped out in a college reading improvement program. The data tended to support the proposition. Other findings such as changes in self concept and grade-point average indicated further support for the theory that those who achieve as well as those who do not, do so as a result of the needs of their own self system.
A COMPARATIVE STUDY OF THE COLLEGE
EXPERIENCES OF GRADUATES OF THE
CORE AND OF THE CONVENTIONAL
CURRICULA


RAYMOND F. GALE 
Ball State Teachers College, Muncie, Indiana 


much CURRICULUM has been the subject of 
three dec а and experimentation during the past 
been devel es. Many curricular patterns have 
to Баа as educational workers һауе tried 
terms of d ше DEWET, concepts of curriculum,in 
Into functi e learning experiences of the child, 
mong the onal forms for school programs. 
um, В "Se proposals has been the core curricu- 
defined гіеПу, the “core curriculum” has been 
lum eaten that part of the total school curricu- 
e feeds endeavors to assist pupils in meeting 
Without r most common to them and to society 
tion, »1 "еВага to any subject-matter classifica- 
Bressive Since the Eight Year Study” of the Pro- 
that the Education Association, which affirm ed 
flexible cae eee оноо! could provide à sound, 
uture tsa i to meet pupils’ present and 
al inter € needs, there has been much education- 
mentation, in the effects of curricular experi- 
ration 101 Upon the high school students’ prepa- 
еке life r and adjustment to the demands of col- 
тев 
Плов Ting Curriculum: at the Highland Park u- 
Progra igh School was established as an elective 
Zed dui іп 1943, The subject matter fields util- 
Grade 116 the last twelve years ave been: 
English. English and social science; Grade X. 
‘Story hes rel and biology; Grade XL, Amenta 
Nglish ( a American literature; and Grade 4^» 
Wasa lo Бгеаі books). The present investigation 
Core cu ngitudinal study of the gradua 
Colle Triculum during their mat ric 
entiona 


ulation at 
onv 


The purpose of the study was to compare the college experiences of selected graduates of the core curriculum and of the conventional curriculum at the Highland Park, Illinois, High School. The investigation sought comparative data in relation to the following general criteria: (1) college acceptance and matriculation, (2) academic preparation for college, (3) scholastic achievements in college, and (4) extracurricular experiences at college. Specific aspects of college experiences included in the analyses of the two curricular samples were: acceptances for admissions, types and sizes of colleges attended, general academic preparation, first semester's grades, estimated scholastic rank, academic honors attained, membership in fraternities or sororities, participation in campus activities and organizations, and the achievement of special extracurricular recognitions.


*Footnotes will be found at the end of this article.


Procedures


The subjects utilized in the study were selected at random from the graduates of the core curriculum and from the graduates of the conventional curriculum at the high school during the period 1947-1952. The samples, which,
aduates in each curricular group, 
were equated on the bases of sex, year of gradua- 

Information, concern ing 
the first semester's at- 
was obtained from the grad- 


er 
in October, 1954. The questionnaire survey yielded a fifty-four percent response. In relevance to the representivity of the questionnaire sample, statistical calculations revealed no significant differences between the respondent sample and the surveyed population on criteria common to both groups. The data were transferred to I.B.M. cards for tabulations, which were subsequently analyzed statistically.
Summary of Findings


Comparative analyses of the data secured from the high school records of the graduates
and the respondents in the questionnaire survey revealed no appreciable differences
between the core and the non-core samples concerning the college experiences selected
for the study. In general, the incidences of college acceptances for the two curricular
groups had been quite similar, the matriculative patterns in regard to the types and the
sizes of colleges attended had been somewhat congruous, the college preparation of the
core and non-core groups had been very accordant, and the scholastic and the social
achievements of the two samples had been quite comparable. The specific findings will be
discussed under the following captions: (1) College acceptance and matriculation,
(2) Characteristics of colleges attended, (3) Preparation for college, (4) Scholastic
achievements, and (5) Extracurricular experiences.


College Acceptance and Matriculation


[...] private church school, 1.[...]; state college or university, 40.0 and 32.2;
normal school or teachers' college, 4.0 and .9; technological institution, 3.0 and 2.8.
[...]

[...] percentage comparison of the core and non-core graduates according to the
size-classifications of colleges attended. The respective percents of enrollment for the
core and the non-core graduates in colleges as classified by student population were:
under 500, 7.0 and 7.3; 500-1,999, 25.0 and 37.8; 2,000-4,999, 12.0 and 10.3;
5,000-7,999, 13.0 and 9.3; 8,000-11,999, 13.0 and 14.1; 12,000-15,999, 12.0 and 9.3;
and 16,000 and over, 13.0 and 11.3.


Preparation for College 


The information presented in this section will be focused on selected academic
experiences of the graduates as revealed by questionnaire items. Answers to these
questions were sought: How adequately had the graduates in each curricular sample been
prepared for college work? What had been the areas of deficiency during participation
in college English courses? How had they achieved in these English courses?

General preparation for college--Table IV presents a percentage comparison of the core
and the non-core graduates concerning their opinions as to their general academic
preparation for college. Only four graduates in each curricular sample indicated that
their preparation was inadequate. The respective percents for the core and the non-core
graduates, according to their opinions as to degree of academic preparation for college,
were: superior, 23.0 and 25.5; very adequate, 38.0 and 45.4; and adequate, 27.0 and
21.7. Thus, 88.0 percent of the core graduates and 91.6 percent of the non-core
graduates felt that they had been adequately or more than adequately prepared
academically for college. In general, the findings indicated very little difference on
this factor for the two groups.

Areas of deficiency in college English courses--How well had the respondents been
prepared for college in a basic course such as English? Since this subject had been
common to the graduates in both curricular samples during their four years at the High
School, the investigator assumed that this factor might have particular significance in
the evaluation of the Core Curriculum. Table V specifies a percentage comparison of core
and non-core graduates according to the area of least adequate preparation for college
English. The percents of mention as indicated by graduates from the core program and
from the conventional curriculum were, respectively: vocabulary, 17.0 and 6.5; oral
activities, 5.0 and [...]; mechanics of grammar, 33.0 and 31.1; diction or word usage,
7.0 and .9; general reference book skills, 2.0 and 2.8; sentence structure, [...] and
4.1; formal writing, 10.0 and 19.8; creative writing, 14.0 and 23.7; no deficiencies,
9.0 and 12.3; and, other inadequacies, 5.0 and [...].

Thus, the findings were somewhat similar for
TABLE I


A PERCENTAGE COMPARISON OF CORE AND NON-CORE
GRADUATES ACCORDING TO NUMBER OF ACCEPTANCES
FOR ADMISSIONS TO COLLEGES


                        Core Graduates*                  Non-Core Graduates**
Number of          1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
acceptances        1949   1951   1953   Sample      1949   1951   1953   Sample

One                [...]  [...]  [...]  [...]       [...]   7.7   13.9   10.3
Two                25.9    4.0   18.8   17.0        15.9    7.7   16.7   14.2
Three              18.5   28.0   18.8   21.0        27.2   19.2   39.0   29.2
Four                4.4   16.0   14.6   13.0         6.8   11.6   [...]   9.1
Five               [...]  [...]  [...]  [...]       [...]  [...]  [...]  [...]
Six or more        [...]  [...]   2.1    1.0        [...]  [...]  [...]  [...]


*  35.0 percent of the sample did not respond.
** 30.2 percent of the sample did not respond.
TABLE II


A PERCENTAGE COMPARISON OF CORE AND NON-CORE
GRADUATES ACCORDING TO GENERAL TYPES
OF COLLEGES ATTENDED


                             Core Graduates*                  Non-Core Graduates**
Type of College         1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
Attended                1949   1951   1953   Sample      1949   1951   1953   Sample

Liberal arts            48.1   28.0   22.9   31.0        45.4   34.6   22.2   34.9
Men's college            ---   16.0    8.3    8.0         9.2   11.6   11.1   10.1
Women's college          7.4   20.0    6.3   10.0        20.5    3.8   13.9   14.1
Private church school    ---    4.0    ---    1.0         ---    3.8    ---   [...]
State college
  or university         33.4   32.0   47.9   40.0        25.0   23.1   47.8   32.2
Normal school or
  teachers' college      3.7    ---    6.3    4.0         ---    3.8    ---     .9
Technological
  institution            3.7    4.0    2.1    3.0         ---    3.8    5.6    2.8
Other                  [...]  [...]  [...]  [...]         2.2    7.7    ---    2.8


*  7.0 percent of the sample did not respond.
** 3.7 percent of the sample did not respond.
TABLE III


A PERCENTAGE COMPARISON OF CORE AND NON-CORE
GRADUATES ACCORDING TO SIZES OF COLLEGES
OR UNIVERSITIES ATTENDED


Student                   Core Graduates*                  Non-Core Graduates**
Population           1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
of Colleges          1949   1951   1953   Sample      1949   1951   1953   Sample

Under 500             3.7   12.0    6.3    7.0         6.8   11.6    5.6    7.3
500-1,999            35.9   40.0   16.7   25.0        50.0   30.7   21.8   31.8
2,000-4,999           7.4   12.0   14.6   12.0         9.2    3.8   16.7   10.3
5,000-7,999          14.8   24.0    6.3   13.0         9.2   15.4    5.6    9.3
8,000-11,999         22.2    8.0   10.4   13.0        11.4   19.2   13.9   14.1
12,000-15,999         3.7    4.0   20.8   12.0         6.8    7.7   13.9    9.3
16,000 and over      11.1    4.0   18.8   13.0        11.4    3.8   16.7   11.3


*  7.0 percent of the sample did not respond.
** 4.0 percent of the sample did not respond.


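TABLE IV


A PERCENTAGE COMPARISON OF CORE AND NON-CORE
GRADUATES CONCERNING OPINIONS AS TO GENERAL
ACADEMIC PREPARATION FOR COLLEGE


                          Core Graduates*                  Non-Core Graduates**
                     1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
                     1949   1951   1953   Sample      1949   1951   1953   Sample

Superior             25.9   28.0   18.8   23.0        [...]  [...]  [...]  25.5
Very Adequate        44.4   36.0   35.4   38.0        54.5   34.6   41.7   45.4
Adequate             18.5   28.0   31.3   27.0        [...]  [...]  30.5   21.7
Inadequate           [...]  [...]  [...]  [...]       [...]  [...]  [...]  [...]


*  [...] percent of the sample did not respond.
** [...] percent of the sample did not respond.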
TABLE V


A PERCENTAGE COMPARISON OF CORE AND NON-CORE
GRADUATES ACCORDING TO THE AREA OF LEAST ADEQUATE
PREPARATION FOR COLLEGE ENGLISH


Area of Least                Core Graduates*                  Non-Core Graduates**
Adequate                1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
Preparation             1949   1951   1953   Sample      1949   1951   1953   Sample

Vocabulary              [...]  24.0   14.6   17.0         6.8   [...]   5.6    6.5
Oral Activities         [...]  [...]  [...]  [...]       [...]  [...]  [...]  [...]
Mechanics of Grammar    33.3   28.0   35.4   33.0        34.0   11.6   [...]  [...]
Diction or Word Usage   [...]  [...]  [...]  [...]       [...]  [...]  [...]  [...]
General Reference
  Book Skills           [...]   4.0   [...]   2.0         2.2   [...]  [...]   2.8
Sentence Structure      18.5   16.9   10.4   14.9         9.2   [...]  [...]  [...]
Formal Writing          [...]   8.0   16.7   10.0        18.4   15.4   25.0   [...]
Creative Writing        18.8   [...]  12.5   14.9        20.5   30.7   22.2   [...]
No Deficiencies          7.4   [...]  10.4    9.0        11.4   23.1   [...]  [...]
Other Inadequacies       3.7   12.9    2.1    5.0        [...]  [...]  [...]  [...]


*  16.0 percent of the sample did not respond.
** 7.6 percent of the sample did not respond.
the two groups, except for perceptible deviations [...]. Evidently, the core graduates
had felt less inadequacies in creative writing and in formal writing, while the non-core
graduates had been less deficient in vocabulary and in diction or word usage. The most
discernible deficiencies had been in mechanics of grammar, which were specified as
inadequacies by about one third in each sample. Other significant areas of concern for
both groups had been sentence structure, vocabulary, creative writing, and formal
writing.


Grades achieved in English courses--What had been the relative achievements of the core
and the non-core respondents in English courses as college Freshmen? According to
Table VI, the respective percents for the core and the non-core graduates, in relevance
to grades achieved in college English courses during the first year, were: A or
equivalent, 14.0 and 16.0; B or equivalent, 38.0 and 34.0; C or equivalent, 28.0 and
36.0; D or equivalent, 3.0 and 4.6; failure, 0.0 and .9; excused from the course because
of significant scores on tests, 10.0 and [...]; and course not required, 1.0 and .9.

In general, the achievement patterns for the two samples had been somewhat consistent.
Of the [...] sample, 5.5 percent of the graduates who had been enrolled in Freshman
English courses attained less than a C grade or the equivalent. Every tenth core
respondent had been excused from the required English course because of a significant
test score; this factor had been true for the non-core sample [...] case in every [...].
The percentage findings indicated that the non-core respondents had tended to receive
more A and C grades than had the core graduates, while the respondents from the core
[...] achieved more grades in the B classification.


Scholastic Achievements


How had the scholastic achievements of the core and the non-core graduates compared
during their college matriculation? To seek an answer to this question, the following
criteria were used: (1) [...] grades as reported by the colleges, (2) estimated [...]
graduates, and (3) scholastic [...]. The data for the first topic were [...] from the
cumulative High School records; [...] two items were included in the questionnaire.

First semester's grades as reported by the colleges--How had the scholastic [...] of the
core and the non-core graduates compared during their first semester of college work as
revealed by grades? In [...] data, the investigator [...] that [...] influences of the
High School had carried, for the most part, through the first semester's work at the
college level; after that, possibly, the scholastic forces in the college environment
had tended to become stronger. Table VII, which follows, enumerates a comparison of the
mean grades achieved by the core and the non-core graduates as college Freshmen during
their first semester.

Since English, science, and social science had been common to both curricular samples
during high school attendance, these subjects were selected for comparative purposes.
The fourth entry specified the mean grades for all of the subjects in which the
graduates had been enrolled.

In general, the findings revealed that the ranges of the mean grades for the subgroups
in the core curriculum sample were: English, 3.3-3.4; science, 3.2-3.5; social science,
3.3-3.5; and all subjects, 3.2-3.5. Corresponding data for the subgroups in the regular
curriculum sample were: English, 3.1-3.5; science, 3.3-3.5; social science, 3.3-3.4; and
all subjects, 3.4-3.9. The patterns of achievement in English, science, and social
science [...] consistent for the core and [...]. In the "all subjects" classification
and in science, the small positive differences in the means [...] in favor of the
non-core graduates, while [...] held a slight advantage in [...]. In general, the
dif[...] the achievements for the two groups [...] their scholastic [...] English,
science, [...] and "all subjects" represented slightly above "average" achievement or a
grade of C as originally [...].

Estimated scholastic quartile ranks--How had [...] status of the two curricular samples,
[...] quartile ranks, compared in college? [...] questionnaire asked: "To [...] on the
basis of scholarship [...] of your (class, [...] class) [...] rank?" Table VIII [...]
comparison of the findings. The respective [...] and the non-core [...] their estimated
class [...] and 34.0. [...] 12.0 percent of the respondents [...] had ranked in the upper
[...] scholastically, while [...] non-core respondents had designated [...]. One person
in the core group had been in the lowest quarter; four respondents
TABLE VI


A PERCENTAGE COMPARISON OF CORE AND NON-CORE GRADUATES
ACCORDING TO GRADES ACHIEVED IN FRESHMAN
ENGLISH COURSES IN COLLEGE


Grades in                      Core Graduates*                  Non-Core Graduates**
Freshman                  1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
English                   1949   1951   1953   Sample      1949   1951   1953   Sample

A or equivalent           22.2   16.0    8.3   14.0        15.9   23.1   11.4   16.0
B or equivalent           37.0   32.0   41.7   38.0        36.8   30.7   33.3   34.0
C or equivalent           22.2   32.0   29.2   28.0        34.0   30.7   41.7   36.0
D or equivalent            3.7    4.0    2.1    3.0         2.9    ---   11.1   [...]
Failed the course          ---    ---    ---    ---         2.2   [...]  [...]    .9
Excused from the course   11.1   [...]  [...]  [...]       [...]  [...]  [...]  [...]
Course not required        ---    ---    2.1    1.0        [...]  [...]  [...]  [...]


*  8.0 percent of the sample did not respond.
** 4.7 percent of the sample did not respond.
TABLE VII


A COMPARISON OF MEAN GRADES ACHIEVED BY CORE AND
NON-CORE GRADUATES AS COLLEGE FRESHMEN
DURING THE FIRST SEMESTER


                             Core Graduates (b)         Non-Core Graduates (b)
I. Mean Grades (a)         1947-   1950-   1952-       1947-   1950-   1952-
                           1949    1951    1953        1949    1951    1953

   English                  3.4     3.3     3.3         3.5     3.1     3.2
   Science                  3.5     3.2     3.4         3.5     3.3     3.5
   Social Science           3.5     3.4     3.3         3.3     3.3     3.4
   All Subjects             3.5     3.3     3.2         3.9     3.4     3.5


                                 Core and Non-Core Samples
II. Differences in Means (c)   1947-       1950-       1952-
                               1949        1951        1953

   English                    .1 (NC)     .2 (C)      .1 (C)
   Science                     ---        .1 (NC)     .1 (NC)
   Social Science             .2 (C)      .1 (C)      .1 (NC)
   All Subjects               .4 (NC)     .1 (NC)     .3 (NC)


(a) Grades ranged from a possible low of 1.0 to a possible high of 5.0.

(b) Sample: 1947-1949, 101; 1950-1951, 66; 1952-1953, 72.

(c) Positive differences in means in favor of the core or the non-core
graduates are indicated by (C) and (NC), respectively.
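
The second part of Table VII follows arithmetically from the first. As a minimal sketch, and not a procedure described by the investigator, the differences in means can be recomputed from the mean grades in Part I. The names `core`, `non_core`, and `mean_difference` are illustrative only.

```python
# Mean grades from Part I of Table VII (scale: 1.0 low to 5.0 high).
# Each list holds the 1947-1949, 1950-1951, and 1952-1953 subgroups in order.
core = {
    "English": [3.4, 3.3, 3.3],
    "Science": [3.5, 3.2, 3.4],
    "Social Science": [3.5, 3.4, 3.3],
    "All Subjects": [3.5, 3.3, 3.2],
}
non_core = {
    "English": [3.5, 3.1, 3.2],
    "Science": [3.5, 3.3, 3.5],
    "Social Science": [3.3, 3.3, 3.4],
    "All Subjects": [3.9, 3.4, 3.5],
}


def mean_difference(core_mean, non_core_mean):
    """Return (size, favored group); favored is "C", "NC", or "--" for a tie."""
    # Round to one decimal place to remove binary floating-point noise.
    diff = round(core_mean - non_core_mean, 1)
    if diff > 0:
        return diff, "C"
    if diff < 0:
        return -diff, "NC"
    return 0.0, "--"


differences = {
    subject: [
        mean_difference(c, nc)
        for c, nc in zip(core[subject], non_core[subject])
    ]
    for subject in core
}

for subject, row in differences.items():
    cells = "  ".join(f"{size:.1f} ({favored})" for size, favored in row)
    print(f"{subject:<15} {cells}")
```

Every recomputed entry agrees with Part II of the table; the tie in science for 1947-1949 corresponds to the blank entry there.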


TABLE VIII


A PERCENTAGE COMPARISON OF CORE AND NON-CORE GRADUATES
ACCORDING TO SCHOLASTIC QUARTILE RANK IN COLLEGE


                          Core Graduates*                  Non-Core Graduates**
Estimate of          1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
Rank                 1949   1951   1953   Sample      1949   1951   1953   Sample

Lowest Quarter        ---    ---    2.1    1.0         4.6    7.7    ---    3.8
Second Quarter       25.9   16.0   14.6   18.0        11.4   [...]  33.3   17.9
Third Quarter        29.6   44.0   33.3   35.0        41.6   30.7   30.5   37.9
Highest Quarter      33.3   28.0   43.8   37.0        29.4   42.3   33.3   34.0


*  9.0 percent of the sample did not respond.
** 7.6 percent of the sample did not respond.
TABLE IX


A PERCENTAGE COMPARISON OF CORE AND NON-CORE GRADUATES
WHO WERE HONOR STUDENTS SCHOLASTICALLY IN COLLEGE


                               Core Graduates*                  Non-Core Graduates**
                          1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
Honor Student             1949   1951   1953   Sample      1949   1951   1953   Sample

Yes                       25.9   28.0   31.3   29.0        31.3   34.6   27.8   31.1
No                        55.5   56.0   41.9   52.0        54.5   42.3   52.8   50.0
No official recognition
  by the college           3.7    4.0   [...]  [...]       [...]  [...]  [...]  [...]


*  12.0 percent of the sample did not respond.
** 9.4 percent of the sample did not respond.


TABLE X


A PERCENTAGE COMPARISON OF CORE AND NON-CORE GRADUATES
ACCORDING TO MEMBERSHIP AND NON-MEMBERSHIP IN
SOCIAL FRATERNITIES OR SORORITIES


                               Core Graduates                   Non-Core Graduates
                          1947-  1950-  1952-  Total       1947-  1950-  1952-  Total
Status of Membership      1949   1951   1953   Sample      1949   1951   1953   Sample

Membership                40.7   56.0   47.9   48.0        56.8   65.3   58.3   59.2
Non-Membership            18.5   32.0   35.4   30.0        18.4    3.8   16.7   14.2
Unavailable at the col-
  lege or university      29.6    4.0   10.4   14.0        22.8   23.1   25.0   23.6
from the non-core sample had specified the same standing. For the most part, the data
indicated no appreciable differences between the core and the non-core samples in
relevance to the respondents' estimated scholastic class ranks during college
attendance.

Scholastic honors--In high school, 27.6 percent of the core graduates and 28.4 percent
of the non-core graduates from a total sample of 239 in each curricular group had been
selected for membership in the National Honor Society, the award being made on a basis
of scholarship, character, leadership, and service. Thus, there had been no perceptible
differences for the two samples in relevance to the highest recognition accorded to high
school seniors. What had been the patterns at the college level? An item in the college
section of the questionnaire asked: "(Are, were) you an honor student scholastically?
(e.g. [...] dean's list, graduate with honors, etc.)" The findings, submitted in
Table IX, revealed that 29.0 percent of the core graduates and 31.0 percent of the
non-core graduates who responded to the questionnaire survey had been officially
recognized as honor students during their matriculation at colleges. These data
indicated that, in general, the percentages of incidences for the two samples in the
attainment of scholastic recognitions during college were similar; the findings were
very consistent to those revealed for the two samples in regard to membership in the
National Honor Society at the high school level.
Extracurricular experiences in college--What had been the extracurricular experiences of
the core and the non-core graduates in college as revealed by questionnaire items? This
section will be devoted to a discussion of certain facets from the college
extracurricular lives of the respondents captioned: (1) Membership and non-membership in
social fraternities or sororities, (2) Participation in campus activities and
organizations, and (3) Special recognitions.

Membership and non-membership in social fraternities or sororities--Table X shows a
percentage comparison of the core and the non-core graduates according to membership and
non-membership in social fraternities or sororities during college.

Approximately one half of the core sample had belonged to fraternities or sororities
during college, whereas about three fifths of the non-core respondents had been members.
More than twice the percentage of core graduates than non-core respondents revealed that
they had not belonged to a fraternity or to a sorority. In regard to the factor of
availability, 14.0 percent of the core and 23.6 percent of the non-core sample members
specified that fraternities or sororities had been unavailable at the colleges of their
attendance. So, in effect, the availability had been more limited for the non-core
respondents, yet they had shown a substantially larger membership in social fraternities
and sororities than the core graduates.

Participation in campus activities and organizations--What had been the social activity
patterns of the two curricular groups during college? Table XI shows a percentage
comparison of the core and the non-core respondents according to their participation in
campus activities and in organizations during college.

For the most part, the patterns of participation in activities classified under ten
arbitrary categories were quite similar for the two samples except for discernible
variations in dramatics, religious activities, sports, and student government. The
non-core respondents reported more appreciable incidences of participation in sports,
dramatics, and student government, whereas the core graduates tended to have been
associated more readily with religious activities or organizations.

Special recognitions--How did the social distinctions attained by the two curricular
groups during college compare? Table XII specifies that the respective percents for the
core and the non-core samples as related to the attainments of special recognitions at
college were: class officers, 10.0 and 9.3; officers of non-honorary organizations, 27.0
and 32.1; members of honorary organizations, 17.0 and 11.3; special awards, 17.0 and
15.8; miscellaneous, 0.0 and 2.8; and none, 34.0 and 42.5.

Thus, about one third of the core graduates and approximately two fifths of the non-core
respondents had received no special recognitions. In regard to class officers and
special awards, the two samples had been about equally represented. The core respondents
had tended to receive more recognitions in honorary organizations, while the non-core
graduates had been selected more frequently as officers in non-honorary organizations.
In general, the incidences of special recognitions within the two samples were somewhat
consistent.


Conclusions of the Study 


Some of the findings of the study are of a fac- 
tual nature; other data represent the opinions of 
graduates. Three general conclusions, which 
emanated logically from the data of the study, are 
listed below: 

1. The core graduates had been as well prepared for college matriculation as had the
non-core graduates.--The core graduates had been accepted by colleges as readily as the
graduates of the conventional curriculum, and they had attended colleges as frequently
as the non-core members. Although the attendance patterns of the two samples differed
somewhat in relation to the general types and the sizes of colleges, apparently the
desires or the preferences of the core graduates had not been restricted. Approximately
nine tenths of the graduates in each sample felt that their general college preparation
had been adequate or more than adequate. About one third of each curriculum sample
specified inadequacies in mechanics of grammar in relation to Freshman English courses;
the general patterns of deficiencies in English were similar for the core and the
non-core groups.

2. The core graduates had achieved academically in college as well as had the non-core
graduates.--This conclusion has been reached in the light of significant evidence
secured from the questionnaire and the high school records. The mean grades achieved by
the core and the non-core graduates during their first semester as college freshmen were
very much alike. The findings of the questionnaire included no discernible differences
in the academic achievements of the two samples on the criteria of estimated class ranks
and scholastic recognitions.

3. The social achievements of the core and the non-core graduates during college
attendance had been somewhat similar.--Approximately half of the core sample had
belonged to fraternities or sororities during college, while almost three fifths of the
non-core graduates had been members of these social organizations. About two thirds of
the core and three fifths of the non-core graduates had received special recognitions of
a social nature during their college years. The core sample members had tended to
receive more distinctions in honorary organizations, whereas the non-core graduates had
been selected more frequently as officers in non-honorary groups.


FOOTNOTES 


* Adapted from a dissertation presented in partial fulfillment for the degree of Ed.D.
  at Northwestern University.

1. Dictionary of Education, p. 114. Prepared un- 
der the auspices of Phi Delta Kappa, Carter 
V. Good, editor. New York: McGraw-Hill 
Book Co., Inc., 1945. 

2. Wilford Aikin, The Story of the Eight-Year 
Study. New York: Harper and Brothers, 1942. 
A STUDY OF THE VIEWPOINTS HELD BY
SCHOOL ADMINISTRATORS REGARDING
VOCATIONAL EDUCATION IN
THE SECONDARY SCHOOL


FRANK J. WOERDEHOFF and RALPH R. BENTLEY 
Purdue University 


SECTION I
INTRODUCTION


THE EDUCATIONAL viewpoints held by school administrators are presumed to be important
factors in determining curriculum offerings in the secondary school. Of course, the
educational practices in the school system may or may not [...] to the educational
viewpoints held by the administrators. It must be recognized that the school
administrator cannot always translate his educational viewpoints into practice because
of a [...]. Nevertheless, in terms of [...] inference, the school administrators are in
a favorable position to exert influence on the curriculum design of the secondary
school. Consequently, it is reasonable to assume that their viewpoints regarding
vocational education contribute much toward the degree of acceptance or rejection of
this phase of secondary education and [...] in which the program is carried out.
Therefore, this study was undertaken to secure the viewpoints of Indiana school
administrators regarding pertinent questions dealing with vocational education.


Description of Study 


The study was cooperatively planned and carried out as a team research project by three
members of the Purdue University vocational teacher education faculty representing
vocational agricultural education, vocational home economics education, and trade and
industrial education. The study received the endorsement and support of the presidents
of the following organizations:


1. Indiana Association of Town and City Superintendents,

2. Indiana County Superintendents' Association, and

3. Indiana Association of Secondary School Principals.


Purpose of This Study 


The purpose of this study was to discover the viewpoints of Indiana school
administrators regarding (1) vocational education in general, (2) vocational
agriculture, (3) vocational home economics, and (4) vocational trade and industrial
education, and to determine whether there were significant differences among school
administrators categorized according to (1) type of administrative position, and
(2) experience with vocational education programs.


Definition of Terms


Certain terms used in this study are defined as 

follows: 

Viewpoint. The term "viewpoint" may be defined as an affectively toned idea or group
     of ideas predisposing a person to action with reference to a specific object.

School Administrator. The term "school administrator" in this study will refer to
     (1) county superintendents, (2) superintendents of independent school districts
     (city and town), (3) city secondary school principals, and (4) county secondary
     school principals (principals who administer schools under the jurisdiction of
     county superintendents).

Vocational Education. The term "vocational education" as used in this study refers to
     three of the four vocational education programs in the secondary schools which are
     approved for federal reimbursement by the Indiana State Department of Public
     Instruction. These vocational education programs include home economics education,
     agricultural education, and trade and industrial education.


Research Procedure and Sample 


The data upon which this study was based were 
collected by means of a questionnaire designed 
TABLE I


NUMBER OF SECONDARY SCHOOL PRINCIPALS BY
SIZE OF SCHOOL WHERE EMPLOYED


                     Number of
Size of School       Principals

Less than 100           169
100 - 249               185
250 - 499                80
Over 500                 80
Total                   514


TABLE II 


NUMBER OF SECONDARY SCHOOL PRINCIPALS HAVING AND NOT 
HAVING EXPERIENCE WITH VOCATIONAL EDUCATION 


Experience 
Number Number Not 
Vocational Area Having Having 
Vocational Agriculture 392 122 
Vocational Home Economics 433 81 
Vocational Trade and 
Industrial Education 132 382 


TABLE III 


NUMBER OF ADMINISTRATORS BY TYPE OF ADMINISTRA- 
TIVE POSITION 


Administrative Position Number 
County Superintendents 74 
City Superintendents 124 
County Principals 372 
City Principals 142 
Total 712 
(1) to obtain personal data concerning type of administrative position, and experience
with vocational educational programs, and (2) to obtain the school administrators'
viewpoints regarding vocational education in general, vocational agriculture, vocational
home economics, and vocational trade and industrial education.

The items in the questionnaire were prepared by the investigators and were reviewed by
Purdue University Division of Education faculty members concerned with school
administration, general secondary education, and vocational education. The items were
revised in light of the criticisms and suggestions of staff members.

The questionnaire included 112 items which 
were distributed as follows: (1) 20 items on gen- 
eral vocational education, (2) 30 items on voca- 
tional agriculture, (3) 30 items on vocational home 
economics, and (4) 32 items on vocational trade 
and industrial education. An opportunity was pro- 
vided for the respondents to indicate whether they 
strongly agreed, agreed, were undecided, dis- 
agreed, or strongly disagreed with each item. 

The questionnaires were sent to all of the county and city superintendents and secondary
school principals in Indiana. Of the 1027 questionnaires sent, 712 or nearly 70 percent
were returned. The distribution and description of the administrators are shown in
Tables I, II, and III.
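
The return figures and the totals of Tables I and III can be cross-checked with simple arithmetic. The following is only an illustrative sketch using the counts printed in this study; the variable names are assumptions introduced here.

```python
# Counts taken from the text and from Tables I and III of this study.
questionnaires_sent = 1027
questionnaires_returned = 712

principals_by_school_size = {   # Table I
    "Less than 100": 169,
    "100 - 249": 185,
    "250 - 499": 80,
    "Over 500": 80,
}
administrators_by_position = {  # Table III
    "County Superintendents": 74,
    "City Superintendents": 124,
    "County Principals": 372,
    "City Principals": 142,
}

# Share of mailed questionnaires that came back, in percent.
response_rate = 100 * questionnaires_returned / questionnaires_sent

total_principals = sum(principals_by_school_size.values())
total_administrators = sum(administrators_by_position.values())
principals_in_table_iii = (
    administrators_by_position["County Principals"]
    + administrators_by_position["City Principals"]
)

print(f"Response rate: {response_rate:.1f} percent")
print(f"Principals (Table I): {total_principals}")
print(f"Principals (Table III): {principals_in_table_iii}")
print(f"All administrators (Table III): {total_administrators}")
```

The return works out to about 69.3 percent, in line with the "nearly 70 percent" reported, and the 514 principals of Table I equal the county and city principals of Table III combined.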

The data were tabulated for respondents categorized according to (1) type of
administrative position, and (2) experience with specific vocational education programs.
The responses "strongly agree" and "agree" were combined, as were the responses
"strongly disagree" and "disagree." Percentages of responses were computed for the
administrators in each of the above categories and for the total group.
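
The combining of response categories described above can be sketched as follows. The ten responses are hypothetical and serve only to show the tabulation; the study itself used no such program, and the names `COLLAPSE` and `percentage_table` are assumptions made for this illustration.

```python
from collections import Counter

# Hypothetical responses of ten administrators to one questionnaire item;
# None stands for an item left unanswered.
responses = [
    "strongly agree", "agree", "agree", "undecided", "disagree",
    "strongly disagree", "agree", None, "strongly agree", "disagree",
]

# Five-point responses are collapsed to three, plus a no-response category.
COLLAPSE = {
    "strongly agree": "agree",
    "agree": "agree",
    "undecided": "undecided",
    "disagree": "disagree",
    "strongly disagree": "disagree",
    None: "no response",
}
CATEGORIES = ("agree", "undecided", "disagree", "no response")


def percentage_table(item_responses):
    """Return the percent of respondents falling in each collapsed category."""
    counts = Counter(COLLAPSE[r] for r in item_responses)
    total = len(item_responses)
    return {cat: round(100 * counts[cat] / total, 1) for cat in CATEGORIES}


table = percentage_table(responses)
for category, percent in table.items():
    print(f"{category:<12} {percent:5.1f}")
```

The same function could be applied separately to each group of administrators and to the total group, which is the layout the analysis tables of this study follow.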

The chi-square technique was used to ascer- 
tain whether or not there were significant differ- 
ences in the responses to each item wherever the 
numbers were sufficiently large. Comparisons 
were made between: 


1. County superintendents and city superintend- 
ents, 

2. County principals and city principals, 

3. County superintendents and county princi- 
pals, 

4. City superintendents and city principals, 

5. Secondary school principals who have and 
have not had administrative experience with 
vocational education programs. 
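
For readers unfamiliar with the technique, a chi-square comparison of two groups on one item might be sketched as below. The counts are hypothetical, and the 5 percent level and the three-category layout are assumptions, since the study does not state the level of significance or the table form it used.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the response did not depend on the group.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts (agree, undecided, disagree) on a single item.
county_superintendents = [50, 10, 14]   # 74 respondents, as in Table III
city_superintendents = [100, 10, 14]    # 124 respondents, as in Table III

statistic, dof = chi_square([county_superintendents, city_superintendents])

# Standard chi-square critical values at the 5 percent level.
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991}
significant = statistic > CRITICAL_5_PERCENT[dof]

print(f"chi-square = {statistic:.3f}, df = {dof}, significant: {significant}")
```

The study's proviso that comparisons were made only "wherever the numbers were sufficiently large" reflects the usual caution that the statistic is unreliable when expected counts are small; a fuller version would check the expected counts before testing.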


SECTION II
ANALYSIS OF DATA


The results of this study are shown in Tables IV, V, VI, and VII. These tables show the
percentages of administrators who agreed, disagreed, were undecided, or did not respond
to the items in the questionnaire dealing with various aspects of vocational education.
Also, these tables show the significant differences between the responses of
administrators when grouped according to type of position. In Tables V, VI, and VII,
the significant differences between administrators who have and have not had experience
with vocational education are shown.


Vocational Education 


The findings shown in TableIV reveal that 80 per- 
cent of the administrators believed that the suc- 
cess of a local program of vocational education de- 
pends largely upon the degree to which they en- 
courage and support the program. Among these 
administrators 27 percent believed that vocational 
education caused too many administrative prob- 
lems. Nevertheless, 91 percent believed that vo- 
cational education should be provided in the high 
school and 84 percent agreed that skills for earn- 
ing a living are as important as skills for social 
living. Furthermore, 94 percent believed that vo- 
cational education courses deserve credit equal to 
academic courses in the curriculum. Only five 
percent expressed the view that bright pupils should 
be discouraged from taking vocational courses. Sev-
enty-one percent of the administrators indicated
that the per pupil cost for vocational education is
justifiable and 82 percent believed that the enroll-
ment per class should exceed 25 pupils. The views
administrators expressed varied widely with re-
gard to all pupils being interested in vocational
subjects, the socio-economic level of vocational
education students, the necessity of vocational ed- 
ucation for all pupils and whether vocational edu- 
cation should be general or specific. 

Approximately 50 percent agreed that federal
funds are desirable to finance vocational education.
The majority of the administrators were uncertain
or were in disagreement concerning the extent to
which the State Department of Public Instruction
exercises control over federally reimbursed voca-
tional education programs. Fifty-six percent be-
lieved that state and federal funds should be avail-
able to match equally local school funds for the
travel costs of teacher supervision of pupil
projects. Two-thirds of the administrators were op-
posed to state and federal agencies setting time
allotments for classes in vocational education. How-
ever, 82 percent agreed to having the State Depart-
ment of Public Instruction set standards and ap-
prove local vocational education facilities.


Vocational Education in Agriculture 


Table V shows that the vast majority of
administrators agreed that vocational agriculture
should be an elective course, and, consequently,
did not agree that either freshman boys in rural
areas or farm boys should be required to take
courses in vocational agriculture. Eighty-six percent
agreed that farm shop instruction should be given
when a school maintains a department of vocational
agriculture. Nearly three-fourths favored a
state course of study for vocational agriculture.
There was divided opinion among administrators
regarding the practice of enrolling only those
students in vocational agriculture who have
facilities available for supervised farm practice.
Ninety-[illegible] percent of the administrators
believed that the teacher of vocational agriculture
should visit his students on their home farms in
order to supervise farm practice work, and 89
percent believed that at least three such visits be
made by the teacher each year. However, only 66
percent agreed to recognize time needed for making
farm visits as a part of the agriculture teacher's
workload. Then, too, 92 percent regarded field trips
as an essential part of instruction in vocational
agriculture, and [illegible] percent agreed that
school owned and operated [illegible] should be
used for making field trips. Sixty-[illegible]
percent of the administrators agreed that a school
that maintains a department of vocational
agriculture should have a Future Farmers of America
chapter. More than three-fourths of the
administrators believed that the Future Farmers of
America organization aids students in developing
desirable social, civic, and vocational interests
and abilities. There was considerable difference
of opinion among administrators regarding the
justification of pupils and teachers being absent
from school to participate in F.F.A. and other
vocational agriculture activities.

Three-fourths of the administrators indicated
that the cost for facilities and equipment for
vocational agriculture could be justified and that
departments of vocational agriculture in Indiana
high schools should be maintained where needed even
though they are not reimbursed by federal funds.
There was no common agreement among Indiana
school administrators regarding the school's
responsibility for organizing and conducting classes
for young and adult farmers and regarding
[illegible] agencies other than the public
[illegible] this responsibility.

The findings indicate that these administrators
believed that vocational agriculture teachers are
as well qualified and [illegible] as other teachers
in the secondary school.

Eighty-five percent did not believe that the work
load of vocational agriculture teachers is greater
than that of other teachers, and [illegible]
percent agreed that agriculture teachers should be
employed for twelve months. Only 60 percent agreed
that vocational agriculture teachers should be
responsible for 4-H clubs in the community.
Three-fourths of the administrators agreed that in
each school which maintains a department of
vocational agriculture an advisory committee should
be appointed to work with the teacher of
agriculture and the administrator.


Vocational Education in Home Economics 


The viewpoints held by Indiana school adminis- 
trators regarding Vocational Home Economics are 
shown in Table VI. In one instance, nearly three- 
fourths of the administrators indicated that they be- 
lieved that vocational homemaking courses should 
be elective, and in another instance, 67 percent 
indicated that homemaking courses should be re-
quired of all girls. Fifty-one percent believed that
homemaking education is as important for boys as
for girls. However, 90 percent indicated that boys
as well as girls should have family living courses
in order to prepare them for their responsibilities
as homemakers. Approximately 95 percent of
the administrators agreed that education for home- 
making is as important for girls of superior abil- 
ity as those of lesser ability regardless of their 
socio-economic background. 

A large majority of the school administrators
believed that directed home and community learn- 
ing experiences should be required of each voca- 
tional homemaking pupil. 


Eighty-three percent agreed that a good home- 
making program will include provisions for home 
visitation to help students with their home projects,
and 77 percent believed that vocational homemak- 
ing teachers should visit their students in their 
homes at least twice a year. 

It was found that nearly 80 percent agreed that 
travel expenses should be provided to encourage 
teachers to visit the homes of their students. Only 
66 percent of the administrators would consider 
the time required for making such visits as a rec- 
ognized part of the homemaking teacher's load. 

It may be observed from Table VI that approxi- 
mately four-fifths of the administrators agreed 
that the Future Homemakers organization is a
worthwhile organization which provides situations 
that help students to further develop their leader- 
ship abilities and homemaking skills and abilities. 
Seventy-one percent believed that each department 
of homemaking should organize a F. H. A. chapter. 
In contrast the administrators were less inclined 
to regard the vocational homemaking teacher as 
being responsible for the girls' 4-H club work in
the community.

Sixty-seven percent of the administrators
agreed that it was desirable to have an advisory
committee to work with the homemaking teacher
in each school that maintains a vocational home-
making department. The data indicated that about
one out of every five Indiana school administra-
tors was undecided about the desirability of this
practice.

There appeared to be much uncertainty among
administrators regarding the public schools'
responsibility for adult homemaking education. Only
[illegible] of the administrators agreed that
classes [illegible] should be conducted in
[illegible] that maintain departments of vocational
homemaking.

The data in this study indicated that school ad-
ministrators regard vocational homemaking teach-
ers as being as well prepared for their jobs, as
cooperative, and as capable in teaching situations
as other teachers. The teaching
load of vocational homemaking teachers was re-
garded by 70 percent of the administrators as at
least equal to that of other teachers in the high
school. Two-thirds believed that teachers of
homemaking should be employed for a period of
time extending beyond the school year.


Trade and Industrial Education 


In Table VII the viewpoints of Indiana school ad-
ministrators regarding trade and industrial educa-
tion are shown. Approximately 70 percent of
these administrators believed that trade and in-
dustrial education should be included as a part of
the high school program and only 13 percent be-
lieved that this vocational educational program
was too costly. Over 75 percent believed that the
objective of trade and industrial education should
be to prepare young people for useful employment
and that the instructional equipment and tools
should be comparable to the type used in industry.
One-half of these administrators agreed that high
school girls should be provided an opportunity to
enroll in industrial education training programs.

There seemed to be much uncertainty among 
administrators as to whether the public school or
industry should train adult workers. This is evi- 
denced by their responses to the adult education 
items shown in Table VII. The administrators ex- 
pressed viewpoints which indicated that they were 
uncertain or did not believe that the public school 
had a responsibility for training out-of-school 
youth, semi-skilled workers, apprentices, and 
supervisory and foremen personnel. 

Although there was considerable uncertainty 
among administrators’ opinions regarding the pub-
lic schools’ responsibility for providing vocation-
al trade and industrial education through coopera-
tive work-education programs, the majority opin-
ion would support such a program. The majority
opinion likewise favored having a specially trained 
teacher-coordinator for the program, agreed that 
special curriculum materials should be provided 
for related in-school instruction and believed that 
students should receive high-school credit for co- 
operative work-education. The viewpoints ex- 
pressed by administrators varied widely with re- 
gard to students who are enrolled in cooperative 
work-education programs receiving both wages 
and high-school credit. 

The viewpoints of school administrators indi- 
cated that they believed that a qualified director 
or coordinator should be employed to administer 
the trade and industrial program of the school. 
The administrators favored the establishment of
advisory committees for trade and industrial edu-
cation. Their opinions, however, suggested that
they were uncertain about the composition of such
committees. While 85 percent agreed that employ-
er groups should be consulted for advice, only 56
percent believed that union and employee groups
should be consulted.

The viewpoints expressed by school administra- 
tors regarding the qualifications and competencies 
of trade and industrial teachers were similar to 
those expressed regarding other vocational teach- 
ers. 


Significant Differences 


The findings shown in Tables IV - VII of this 
study reveal that 90 statistically significant differ-
ences occurred at either the 1 percent or the 5
percent level. These differences were distributed
as follows: 


1. Seventeen when the viewpoints of city super-
intendents were compared with county super-
intendents.


2. Twenty-three when the viewpoints of city 
principals were compared with county prin- 
cipals. 


3. Six when the viewpoints of county superin- 
tendents were compared with county princi- 
pals. 


4. Five when the viewpoints of city superintend-
ents were compared with city principals.


These data suggest that there is greater agree- 
ment among city administrators and among county 
administrators respectively than when city admin- 
istrators are compared with county administrators. 

When the viewpoints of principals having had ex-
perience with vocational education were compared
with those not having had such experience,
significant differences were found for 39 items.
It should be further noted that 34 of the 90
significant differences occurred for items dealing
with vocational teacher competencies. Of the 90
significant differences, 42 occurred for items deal-
ing with trade and industrial education. With few
exceptions, those principals having experience
with vocational education expressed the more fa-
vorable viewpoint.
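
The counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names and the percentage helper are illustrative additions.

```python
# Counts of significant differences reported in the text, by comparison.
position_comparisons = {
    "city vs. county superintendents": 17,
    "city vs. county principals": 23,
    "county superintendents vs. county principals": 6,
    "city superintendents vs. city principals": 5,
}
# Principals with vs. without vocational education experience.
experience_comparison = 39

position_total = sum(position_comparisons.values())
grand_total = position_total + experience_comparison

# Differences between settings (city vs. county, same position) and
# differences between positions within the same setting.
between_settings = 17 + 23
within_settings = 6 + 5


def share(count: int, total: int) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(position_total, grand_total)
print(between_settings, within_settings)
print(share(34, grand_total), share(42, grand_total))
```

The four position comparisons and the experience comparison together account for the 90 differences, and city-county contrasts outnumber superintendent-principal contrasts by 40 to 11, which is the basis for the statement about greater agreement within settings.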


SECTION III


SUMMARY AND GENERALIZATIONS


This study was made as an effort to discover
the viewpoints of secondary school administrators
with regard to pertinent questions regarding
vocational education in the secondary school. The
inquiry focused attention upon those principles,
policies and practices which are most acceptable
and least acceptable to school administrators and
furnishes a basis for re-examining vocational
education in the secondary schools of Indiana.
Although these data are based upon the viewpoints
of Indiana school administrators, it is possible
that the findings suggest implications to be
considered for vocational education programs
elsewhere.

The data-gathering instrument was a questionnaire
which included 112 items distributed as follows:
(1) 20 items on vocational education, (2) 30
items on vocational agriculture, (3) 30 items on
vocational home economics, and (4) 32 items on
vocational trade and industrial education. The
questionnaire provided an opportunity for
respondents to indicate whether they strongly
agreed, agreed, were undecided, disagreed, or
strongly disagreed. The data collected was treated
to show percentages of agreement, uncertainty and
disagreement with items in the questionnaire found
among city superintendents, city high school
principals, county superintendents, and county high
school principals. The data were tested for
significant differences that might exist between
secondary school administrators when grouped by
type of position held.

On the basis of the viewpoints [illegible], the
following generalizations [illegible]:

1. Superintendents and secondary school principals
believe that successful programs of vocational
education depend to a large extent upon the degree
to which they encourage and support the program.

2. Although school administrators view themselves
as having a key role in the development of
vocational education programs, they favor having
local advisory committees appointed to counsel
with school administrators and teachers of
vocational subjects.

3. The cost of vocational education courses,
though higher than for most subjects, is considered
justifiable by the majority of school
administrators.

4. School administrators do not believe that
vocational education programs create too many
administrative problems.

5. School administrators believe that [illegible]
for vocational [illegible] responsibility of
[illegible].

6. There seems to be a lack of understanding
among school administrators regarding the
extent to which state and federal authorities
regulate local school programs of vocational
education.

7. The time allotment requirements for class
instruction is a policy which administrators
believe should not be determined by state
and/or federal agencies.

8. School administrators favor having the state
department of public instruction determine
standards for approving the facilities for local
departments of vocational education.

9. School administrators are not strongly opposed
to the use of federal funds for vocational
education.

10. School administrators believe that vocational
education courses should be elective.

11. School administrators do not believe or are
uncertain about the responsibility of the
secondary school for providing vocational education
for out-of-school youth and adults.

12. School administrators do not believe that
bright students should be discouraged from
taking vocational education courses.

13. Opinions between city and county school
administrators regarding the socio-economic level
of pupils enrolled in vocational education
courses are significantly different.

14. School administrators view teachers of
vocational subjects as being comparable to other
teachers with respect to training, professional
attitudes and willingness to cooperate in school
and community activities.

15. School administrators agree that home and
farm supervisory visits are an integral part
of the instructional program in vocational
agriculture and vocational home economics.

16. School administrators view the Future Farmers
of America and Future Homemakers of America
organizations as desirable co-curricular
activities for schools maintaining departments of
vocational agriculture and homemaking.

17. The viewpoints expressed by secondary school
administrators with regard to vocational trade
and industrial education appear to be more
closely associated with rural and urban factors
than by the type of position held.

18. School administrators having had experience
with vocational education programs tend to hold
more favorable viewpoints than those not having
had such experience.


THE EFFECTS OF AN INTRODUCTORY
COURSE IN CHILD DEVELOPMENT ON THE
ATTITUDES OF COLLEGE WOMEN
TOWARD CHILD GUIDANCE*


JAMES WALTERS** 
Florida State University 


THE PRESENT study presents evidence as to 
(a) the difference in attitudes among a selected 
group of undergraduate college women with refer- 
ence to the guidance of children, and (b) the effect 
upon these attitudes of an introductory course in 
child development and guidance examined in rela- 
tion to socio-economic status, intelligence, size 
of family, ordinal position, academic achievement, 
and perception of childhood happiness. 


The Subjects 


The students who served as subjects of the in- 
vestigation were 156 majors in the Division of 
Home Economics at Oklahoma Agricultural and 
Mechanical College. The students were divided 
into two groups, an experimental group and a con-
trol group. The experimental subjects were en-
rolled in an introductory course in child develop-
ment and guidance. The control subjects had not
completed and were not currently enrolled in this
course. A summary description of the subjects
is presented in Table I.

The criteria for the selection of subjects were
as follows: (a) white, (b) single, (c) reared in the
United States, (d) 17-24 years of age, (e) female,
(f) home economics major, and (g) American
Council on Education Psychological Examination
completed. Students who had completed the course
or who were enrolled in other child development
and guidance courses were excluded from this
investigation because it was believed that their
responses might introduce a bias into the results.

A comparison of the 76 experimental and 80
control subjects examined in relation to (a)
scholastic aptitude as measured by the American
Council on Education Psychological Examination,
(b) year in school, (c) socio-economic status as
measured by the McGuire-White Index of Social
Status (Short Form), and (d) academic achievement
as measured by freshman grade point average,
evidenced no statistically significant differences.


Description of the Instruments and Rationale 
of Method 


Of the instruments which have been designed to 
assess attitudes concerning the guidance of chil- 
dren there are two of a paper-and-pencil nature 
whose adequacy seemed sufficient to warrant their
use in this investigation. They are the University 
of Southern California Parent Attitude Survey de- 
veloped by Shoben (10) and the Child Guidance Sur- 
vey developed by Wiley (14). 

In the initial construction of the University of 
Southern California Parent Attitude Survey by Sho- 
ben, a scale of 148 items revealing attitudes con- 
cerning the guidance of children was presented to 
a group of 100 white, urban mothers, 50 of whom 
had problem children and 50 of whom had non-prob-
lem children. The “problem” group consisted of
children who (a) were receiving clinical help for
some personality or behavior problem, or who
(b) had come into the custody of the juvenile author-
ities at least twice, or who (c) had a problem about
which the child's mother had registered a com-
plaint indicating that she would like to have clinical
help with her child if it were available, or if she
could afford it. The “non-problem” group consist-
ed of children who (a) had never received clinical
attention, who (b) had never been taken into 
custody by juvenile authorities, and who (c) had no 
problem for which, in the opinion of the mother, 
clinical help was either desirable or necessary. 

In Shoben's preliminary investigation those
items which differentiated the two groups of 
mothers at the five percent level of confidence or 
beyond were retained. As a result of this proce- 
dure, 85 of the original 148 items were retained. 


TABLE I


DESCRIPTION OF SUBJECTS


                                                        Experimental    Control
Description               Classification                (N = 76)        (N = 80)

Year in School            Sophomores                    67              66
                          Juniors                       9               14
Ordinal Position          Only                          11              14
                          Oldest                        21              28
                          Middle                        19              19
                          Youngest                      25              19
Number of Children
in Family                 Two or less                   37              39
                          More than two                 39              41
Index of Social Status    Upper                         3               5
                          Upper-middle                  33              33
                          Lower-middle                  31              38
                          Upper-lower                   9               4
Academic Achievement      Freshman grade-point average  2.7[illegible]  2.67
A.C.E. Score              Mean                          96.72           95.75
Residence                 Rural                         38              34
                          Urban                         38              46
Childhood Happiness
Rating                    Very happy                    45              41
                          Happy                         12              24
                          Average                       16              14
                          Unhappy                       3               1
                          Very unhappy                  0               0


The items appear in three subscales, i.e., Ignor-
ing, Possessive, and Dominant. The Survey was
then given to 40 mothers, equally divided between
the problem and non-problem categories. The
amount of shrinkage in terms of the magnitude of
the correlation coefficients which serve as indices
of the Survey's validity was not excessive, and the
measures of validity obtained from the second ad-
ministration were as follows: Ignoring, .624;
Possessive, .721; Dominant, .623; and for the Total
Scale, .769.
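
The report does not state which correlation coefficient served as the index of validity. One coefficient commonly used when scale scores are compared across two criterion groups is the point-biserial correlation; the sketch below assumes that choice purely for illustration, and the scores and group codes shown are hypothetical.

```python
from math import sqrt
from typing import Sequence


def point_biserial(scores: Sequence[float], groups: Sequence[int]) -> float:
    """Point-biserial correlation between scores and a 0/1 group code.

    Assumed coding for this sketch: 1 = "problem" group,
    0 = "non-problem" group.
    """
    if len(scores) != len(groups):
        raise ValueError("scores and groups must have the same length")
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    if not ones or not zeros:
        raise ValueError("both groups must contain at least one subject")

    n = len(scores)
    mean_all = sum(scores) / n
    # Standard deviation of all scores, with n in the denominator.
    sd_all = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd_all == 0:
        raise ValueError("scores have no variance")

    p = len(ones) / n  # proportion of subjects coded 1
    q = 1 - p          # proportion of subjects coded 0
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd_all * sqrt(p * q)


# Hypothetical scores for four mothers, two per group.
hypothetical_scores = [1.0, 2.0, 3.0, 4.0]
hypothetical_groups = [0, 0, 1, 1]
print(round(point_biserial(hypothetical_scores, hypothetical_groups), 3))
```

Shrinkage in this setting refers to the drop in such a coefficient when a scale built on one sample is applied to a fresh sample, here the second group of 40 mothers.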

The Child Guidance Survey is a scale consist-
ing of 160 items designed to assess attitudes con-
cerning the guidance of children. The Survey is
composed of eight parts: (a) general home standards;
(b) verbal behavior; (c) expression of hostility;
(d) weaning, thumb-sucking, and feeding; (e) toilet
training; (f) sexual behavior; (g) boy-girl
differences; and (h) crying.

Utilizing the responses of 172 subjects, meas- 
ures of reliability were obtained by Wiley (14) for 
each of the first seven parts of the Survey. In 
each instance the measure obtained was above .80. 
A measure of reliability was not obtained for the 
eighth part, i.e., crying, because of its small 
number of items. 

In order to obtain a measure of the validity of 
the instrument, clinical judgments concerning the 
“sophistication” of the groups taking the test 
were made. For example, it was believed that if
the test measured what it purported to measure,
experienced clinicians, persons who had been
counseled with regard to their children's prob-
lems, and persons who had had the advantage of
special instruction in child development would be
likely to express attitudes which were more favor-
able than would those who had had little experi-
ence with children. The group of 172 subjects
whose responses were analyzed for this portion
of Wiley's study tended to bear out this hypothesis.
A detailed description of the methodology em-
ployed in the validation of this instrument has
been presented elsewhere (14).

The problem of whether attitude tests of a
paper-and-pencil nature are sufficiently
satisfactory to warrant their use in a serious
investigation is not new. The view is frequently
expressed that questionnaires assess surface
opinions and that many unconscious forces are left
untapped by the use of such a technique. Yet the
use of so-called depth techniques is not within the
limits of many serious investigations of attitudes.
Another important question which has not been
satisfactorily answered concerns the relative
merits of the paper-and-pencil questionnaire and
the interview. Are the advantages of the interview
sufficient to warrant its use in the large portion
of research studies in which paper-and-pencil
questionnaires are used? In a study by Stouffer,
et al. (11) questionnaires and closed-end personal
interviews were found to yield nearly identical
information. The work of Metzner and Mann (7) and
Kahn (3) also indicates remarkable similarity
between responses obtained by questionnaires and
by open-end interviews. When a difference was
observed, it was found that the responses to the
questionnaires were more highly predictive in
terms of overt behavior than were responses
obtained by the interview. But it must be
recognized that many investigators believe that
the use of the interview is superior to
paper-and-pencil techniques in attitude
measurement. In fact, the interview is not
infrequently used as a criterion for rating the
validity of questionnaires (1, 13). However,
Nye (9) indicates that the clinical interview has
been uncritically accepted as a perfect instrument
while the reliability and validity of
paper-and-pencil tests have been critically
checked. Recent research by Kelly and Fiske (4)
offers evidence which indicates the fallibility of
the human observer in an interview setting. Too,
the cost of the interview is in many instances
prohibitive, and there are many instances in which
it is not expedient to use the interview. For
example, in studies measuring the effectiveness of
educational programs, as the study herein reported,
it is important for all of the subjects to be
tested immediately prior to the onset of the
experiment. In such instances, questionnaires are
frequently used (2, 5, 8).


Administration of the Instruments 


The University of Southern California Parent 
Attitude Survey and the Child Guidance Survey 
were administered to the students prior to the be-
ginning of classes in September, and after the
course ended in January. Although the instructors
of the various experimental sections were aware
that their students were participating in a research
study, they were not given details of the investiga-
tion, nor were they aware of the nature of the in-
struments being used. The instructors were not
present at the testing sessions.

In order to conceal the identity of the scales,
when they were mimeographed for student use,
they were designated only as Inventory A and In-
ventory B. Several students who were unable to
complete the scales at the designated time com-
pleted them by special appointment. All of the
scales were machine scored. The scales were ad- 
ministered by the writer and by other staff mem- 
bers not engaged in teaching the introductory child 
development and guidance course. 


The Experimental Program 


The introductory course in child development
and guidance required of all students majoring in
the Division of Home Economics at Oklahoma A.
and M. College constituted the educational pro-
gram to which the experimental group was ex-
posed. Each week in the course, two fifty-minute
periods are devoted to theory, two periods to
laboratory, and one fifty-minute observation in
one of the college nursery school-kindergarten
programs is required of each student. Enroll-
ment in the sections is maintained at approximate-
ly 25 students.

The class laboratory sessions are devoted to
such experiences as showing films; discussing
the individual children and their families with the
nursery school teacher leading the discussion; an- 
alyzing case studies of children of preschool age; 
experimenting with creative media such as finger 
paint, clay, and easel paint; listening to chil- 
dren’s records; and reading and discussing chil- 
dren’s books. 

A rich variety of experiences is afforded the 
students in the nursery school-kindergarten pro-
grams. Although the role the students assume is
primarily that of observers, the students are in-
cluded in activities whenever possible and are pro-
vided such experiences as setting the tables for
lunch, helping to prepare lunch, and eating with 
the children. Too, the students not infrequently 
are allowed to assist in taking the children on 
field trips; to help with projects when help is re- 
quested by the children; to read stories to the chil- 
dren; to play records; to prepare materials 
such as mixing finger paint, easel paint and dough 
clay; and to dress the children following their 
rest period. 

The text which was used in the course was the 
fifth edition of Rand, Sweeny, and Vincent’s
Growth and Development of the Young Child pub- 
lished by W. B. Saunders Company in 1953. Many 
supplementary readings were also used. 

The course is taught from a preparental ap- 
proach and emphasizes everyday problems, rou- 
tines, and typical responses of young children. 

In terms of course content, consideration is 
given to such factors as (a) basic needs and devel- 
opmental tasks of young children, (b) principles 
of development, (c) principles of guidance, (d) 
parent-child relationships, and (e) learning. At- 
tention is given to creative media suitable for 
young children, to the emotional and social devel- 
opment of young children, and to the importance 
of the family as a socializing influence. Such 
films as The Terrible Two's and the Trusting 
Three's and The Frustrating Four’s and the Fas- 
cinating Five’s are used. 

Classes are designed so that an exchange of
ideas is encouraged. Classes are friendly and in-
formal, and students are given considerable psy-
chological support to encourage freedom of ex-
pression of ideas for group consideration. Atten-
tion is given to the application of research find-
ings to everyday problems rather than to an anal-
ysis of the values of specific methodology of re-
search relating to children. The focus of the class
is on everyday happenings, normal development
of children, and common forms of guidance. Little
attention is given to the technical aspects of learn-
ing theory; rather, consideration is given to learn-
ing as observed through the eyes of parents.

Because of the bias which might have been in-
troduced into the data had the investigator taught
one of the sections of the introductory course in
child development and guidance, he was relieved
of his teaching responsibility for the course while 
the investigation was in progress. Three instruc- 
tors assumed the responsibility of teaching four 
sections of the course comprising the experiment- 
al treatment. Although the instructors were aware 
that their students were participating in a research 
study, they were not given the details of the inves- 
tigation, nor were they aware of the nature of the 
instruments being used in the evaluation. 


Results 


The initial mean scores obtained on the USC 
Parent Attitude Survey and on the Child Guidance 
Survey by 156 home economics majors classified
according to various subgroups are presented in 
Table II; the differences between mean scores of 
experimental and control groups, in Table III; and 
the changes in scores between final and initial 
tests are presented in Table IV. 
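
The change scores referred to above are final scores minus initial scores for each subject. A minimal sketch of that computation follows; the score values are hypothetical, and the function names are illustrative rather than the study's own procedure.

```python
from statistics import mean
from typing import List, Sequence


def change_scores(initial: Sequence[float], final: Sequence[float]) -> List[float]:
    """Per-subject change: final score minus initial score."""
    if len(initial) != len(final):
        raise ValueError("each subject needs both an initial and a final score")
    return [f - i for i, f in zip(initial, final)]


def mean_change(initial: Sequence[float], final: Sequence[float]) -> float:
    """Average change across subjects in one group."""
    return mean(change_scores(initial, final))


# Hypothetical initial and final scores for two small groups.
experimental_initial = [330, 340, 335]
experimental_final = [320, 332, 329]
control_initial = [338, 342, 336]
control_final = [337, 343, 336]

print(mean_change(experimental_initial, experimental_final))
print(mean_change(control_initial, control_final))
```

Comparing mean changes rather than final scores alone is one way to allow for groups that differ at the initial testing.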

The data indicate that the responses of the sub- 
jects on the USC Parent Attitude Survey were not 
unlike those obtained by Shoben’s (10) parents of 
“non-problem” children. The responses of the
subjects on the Child Guidance Survey were not un- 
like those obtained by Wiley’s (14) students in ed- 
ucation and public speaking classes, that is, his 
relatively “unsophisticated” subjects.

The differences between the experimental and
control groups at the onset of the experiment
appeared to be small. As stated previously,
differences with respect to scholastic aptitude,
year in school, socio-economic status, and academic
achievement were not statistically significant. Nor
was there a significant difference between the mean
scores obtained by the experimental and control
groups on the USC Parent Attitude Survey. However,
at the onset of the experiment the difference
between scores obtained on the Child Guidance Survey
by the experimental and control groups was
significant at the one percent level of confidence,
indicating more favorable attitudes on the part of
the experimental subjects at the time of the initial
testing. A consideration of equating the two groups
by an arbitrary exclusion of certain subjects was
abandoned because of the possibility of [...]
variance, thus biasing the results. Instead, it was
decided to view the changes in re-
TABLE II

INITIAL SCORES OF STUDENTS

                                    USC Parent Attitude Survey   Child Guidance Survey
                                    Initial    Level of          Initial    Level of
Group                          N    Mean       Confidence        Mean       Confidence

Experimental                   76   333.7                        429.3
Control                        80   338.4                        449.7      .01

Upper-Middle                   66   337.7                        436.9
Lower-Middle                   69   333.4                        440.1

ACE: 50 percentile or above    53   331.5      [illeg.]          433.2
ACE: Below 50 percentile      103   338.5      [illeg.]          443.2

Rural                          72   335.6                        440.2
Urban                          84   336.6                        439.4

Two children or less           76   337.2                        444.4      [illeg.]
More than two children         80   335.1                        458.3      [illeg.]

Only                           25   329.2      [illeg.]          435.1
Oldest                         49   337.9      [illeg.]          445.2

Only                           25   329.2                        435.1
Middle                         38   335.7                        439.8

Only                           25   329.2                        435.1
Youngest                       44   338.4                        436.4

Oldest                         49   337.9                        445.2
Middle                         38   335.7                        439.8

Oldest                         49   337.9                        [illeg.]
Youngest                       44   338.4                        [illeg.]

Middle                         38   335.7                        [illeg.]
Youngest                       44   338.4                        [illeg.]

GPA: 3 pt. or above            50   333.3                        [illeg.]
GPA: Below 3 pt.              106   337.5                        [illeg.]

Very happy                     86   338.7                        [4]44.2
Happy                          36   339.1                        [illeg.]

Very happy                     86   338.7                        444.2
Average                        30   325.0      [illeg.]          429.1

Happy                          36   339.1                        441.4
Average                        30   325.0                        [illeg.]

groups on both instruments, suggesting that maturity
and/or factors which were uncontrolled in the
investigation may have contributed to changes in
attitude. Although there was a difference of
approximately 20 points at the time of the initial
testing between experimental and control groups on
the Child Guidance Survey in favor of the
experimental group, at the end of the semester there
was a difference of approximately 50 points between
experimental and control groups, indicating
significantly greater gains for the experimental
group.

Socio-Economic Status — The differences ob- 
tained between means of 66 upper-middle and 69 
lower-middle class students on the USC Parent 
Attitude Survey and on the Child Guidance Survey 
were not statistically significant. The numbers
of students in the upper class and in the upper-lower
class in the present study were too small to warrant
other comparisons.

A difference between the initial means obtained
on the Child Guidance Survey by the experimental
and control subjects of the upper-middle class is
significant at the one percent level of confidence,
the experimental subjects having obtained a score
reflecting more favorable attitudes concerning
the guidance of children.

A comparison of the changes in scores between
initial and final tests of experimental and control
groups in the upper-middle and lower-middle
socio-economic levels reflects significant gains in
the upper-middle and lower-middle classes in the
control group as evidenced by responses on the
USC Parent Attitude Survey. The responses of
the experimental subjects on the Child Guidance
Survey, however, reflect gains significant at the
one percent level of confidence in both the
upper-middle and lower-middle socio-economic groups.
The control subjects in the upper-middle class
evidenced a gain significant at the five percent
level of confidence.

Scholastic Aptitude— The American Council on
Education Psychological Examination which the
students complete upon their entrance to Oklahoma
A. and M. College was utilized as an index of
scholastic aptitude. Students ranking at the
fiftieth percentile or above on the A.C.E.
Examination obtained a significantly better [score]
on the USC Parent Attitude Survey than students
below the fiftieth percentile. Responses on the
Child Guidance Survey did not reflect a com[parable
difference]. Statistically significant differences
between mean scores obtained on both instruments by
experimental and control students at the fiftieth
percentile or above on the A.C.E. Examination were
noted, the experimental subjects indicating more
favorable attitudes concerning the guidance of
children. No significant difference was [noted]
between the initial mean scores on [either]
instrument by experimental and control subjects who
were below the fiftieth percentile on the A.C.E.
Examination.

The changes in scores between initial and final 
tests of experimental and control groups of differ- 
ent levels of scholastic aptitude in general indi- 
cate significant gains in both experimental and con- 
trol groups irrespective of scholastic aptitude. 

Rural-Urban Residence— The initial mean
scores obtained by students from rural and urban
areas do not reflect a significant difference
between the two groups with respect to attitudes
concerning the guidance of children.

A comparison of the initial mean scores ob- 
tained by students from rural and urban areas in 
the experimental and control groups reveals no 
statistically significant differences on the USC 
Parent Attitude Survey. On the Child Guidance 
Survey, however, in both the rural and urban 
groups the experimental subjects obtained signifi- 
cantly lower scores, indicating more favorable 
attitudes concerning the guidance of children. 

The changes in scores between initial and final
tests of experimental and control groups in gener- 
al indicate significant gains in both groups irre- 
spective of rural-urban residence. 

Size of Family— The differences between means
of students who were reared in families of two or
fewer children and students who were reared in
families with more than two children were not
statistically significant on the USC Parent Attitude
Survey. A difference in mean scores on the Child
Guidance Survey, however, is significant at the
five percent level of confidence, the students from
smaller families indicating more favorable
attitudes.

The difference obtained on the Child Guidance
Survey between the students in the experimental
and control groups who were reared in families
with more than two children is significant at the
one percent level of confidence, the experimental
subjects evidencing more favorable attitudes
concerning the guidance of children.

In general, the changes in scores on both instru- 
ments between initial and final tests of experi- 
mental and control subjects from families with 
two or fewer children and from families with more 
than two children indicate significant gains in both 
the experimental and control groups irrespective 
of the number of children in the family. 

Ordinal Position— Little relationship was noted
between ordinal position and attitudes concerning
the guidance of children. The initial mean scores
obtained by students of different ordinal positions
indicate only one statistically significant
difference. On both the USC Parent Attitude Survey
and the Child Guidance Survey significant differences
were obtained between experimental and control
subjects who were oldest children in their families,
the experimental subjects evidencing the more
favorable attitudes concerning the guidance of
children.

An inspection of the changes in scores on the
USC Parent Attitude Survey between initial and
final tests of experimental and control subjects of
different ordinal positions fails to reveal a super- 
iority with respect to gain in the experimental 
group. On the Child Guidance Survey, however, 
only children, middle children, and youngest chil- 
dren in the experimental group evidenced signifi- 
cant gains while the control group subjects in these 
positions did not. 

Academic Achievement— The grade-point average
which the subjects earned during their freshman
year in college was utilized as an index of academic
achievement. When the scores of the students
with a grade-point average of 3 or above (a
3 point being equivalent to a grade of B) were
compared with those with a grade-point average
below 3, no significant differences were noted
with respect to attitudes concerning the guidance
of children.

No significant differences were noted between
experimental and control groups for those students
whose freshman grade-point average was 3
or above. Statistically significant differences,
however, were obtained between experimental and
control groups for those students whose freshman
grade-point average was below 3, the experimental
group holding more favorable attitudes concerning
the guidance of children.

The changes in scores between initial and final 
tests of experimental and control groups of differ- 
ing levels of academic achievement indicate, in 
general, significant gains in both the experiment- 
al and control groups irrespective of academic 
achievement as measured by the grade-point av- 
erage attained at the freshman level. 

Perceptions of Childhood Happiness— The ini- 
tial mean scores obtained on the USC Parent Atti- 
tude Survey by students with different perceptions 
concerning the happiness of their own childhood 
indicate that students who perceive their childhood 
to have been “average” hold attitudes concerning
the guidance of children which are more favorable
than do students who perceive their childhood to
have been “happy” or “very happy.” Responses
to the Child Guidance Survey, however, do not re- 
flect such differences. In general, the data do 
not reflect significant differences between experi- 
mental and control groups. 

The changes in scores between initial and final
tests of experimental and control groups with
different perceptions of childhood happiness indicate
that there is little consistency between findings of
the two instruments, with the exception of those
students who rated their childhood to have been
“very happy.” The gains made by both the
experimental and control subjects who rated their
childhood to have been “very happy” were
statistically significant.

The validity of the happiness ratings obtained
at the beginning of the semester is doubtful. When
the responses on the happiness rating scales
obtained at the beginning of the semester were
compared with those obtained at the end of the
semester, the percentage of agreement between the
responses of the experimental group was .64, and
the percentage of agreement between the responses
of the control group was .66. The data for the two
groups were treated separately in this regard
because it was assumed that the perceptions of the
students enrolled in the child development class
might change more than those of the students in the
control group. The evidence, however, does not
support this assumption.


Discussion 


Any interpretation with respect to the causes
of the changes in attitudes evidenced in the control
group must necessarily be regarded as speculation.
There is the possibility that such gain reflects
mere maturation. Also, every research worker in the
social sciences is acutely aware that in
experimental work only a portion of the significant
variables are adequately controlled. The problem,
in part, is in knowing what factors are relevant as
well as in being able to control them. In the
present study, for example, even though one might
logically assume randomization with respect to the
courses in which the experimental and control
subjects were enrolled, the fact remains that during
the time the experimental subjects were enrolled in
the introductory course in child development and
guidance, the control subjects, for the most part,
were enrolled in other courses carrying
approximately the same credit. Undoubtedly, some of
these were home economics courses which, although
not specifically concerned with child development
and guidance, were “family centered” in their
approach. This may have contributed to the
modification of attitudes toward the guidance of
children.


[...] “experimental treatment” [... the
introduc]tory course in child development and
guidance [... as meas]ured by the USC Parent
Attitude Survey, however. [...] that the initial
scores ob[tained on the USC Pare]nt Attit[ude Survey
c]ompar[e] favorably with those of parents of
[“non-problem”] children reported by Shoben (10),
and that they compare favorably with those of 207
undergraduate men reported by Walters and Bridges
(12).
The present study demonstrates that certain
attitudes concerning the guidance of children can be
modified during the course of a semester. In the
opinion of the investigator, it reflects what may
be achieved in schools throughout the country in
which there is a sincere desire to promote the
welfare of children. Since one of the purposes of
education for family living is the modification of
attitudes, it would seem that similar assessments
of the attitudes of young people at the secondary
level as well as of men and women in colleges and
universities might well serve as an important
basis for curriculum planning.
THE PREDICTIVE VALUE OF A TEACHER
JUDGMENT TEST


GEORGE JOSEF SCHICK 
Sacramento, California


The Problem 


THE PURPOSE of this investigation is to study
the predictive effectiveness of a teacher judgment
test with particular reference to grade-point
average in professional courses, student-teaching
grades and efficiency ratings by school supervisors
given after six months of teaching. The study
will concern itself with a so-called judgment test
of a situational type, wherein certain facts are
given and various sorts of judgments are sought.
In making this approach it is thought that teachers
must make many judgments in teaching and the
quality of these judgments will correlate positively
with the effectiveness of teachers.
The effectiveness of teachers will be measured
by subjective evaluations through the Wisconsin
Adaptation of the M-Blank by supervisors of
schools. More specifically, this study will attempt
to provide the evidence relative to the following
questions:
1. Are scores on the judgment test predictive
of success in professional courses?
2. Are scores on the judgment test predictive
of student-teaching grades?
3. Are scores on the judgment test predictive
of supervisors' rating of teachers at the end
of a six month period?


Three criteria will be employed for predicting
teaching success. They are: (a) grade-point
averages in professional education courses, (b)
student-teaching grades, and (c) principals' or
supervisors' judgments on the effectiveness of
teachers. During this investigation, subjective and
objective data will be used. It is hoped that these
data can be used for predicting future teaching
success.


Importance of the Problem


The current demand for teachers is by far
greater than the supply available. This has not
always been true. Twenty years ago a superintendent
of schools could select from a large group of
candidates, who applied for the teaching position,
the one he thought would be best fitted for the job.
This meant that the superintendent could select
the candidate that promised to develop into a
successful teacher, based on superior capacities,
training, and apparent potentialities. Because it
is true that most graduates of a teacher-education
institution can find teaching jobs today, it is a
necessity that communities, schools, and
teacher-education institutions assume joint
responsibility to admit only those candidates for a
future teaching career who possess superior
qualifications, outstanding character traits and
favorable personal qualities. Herbert I. Von Haden¹
commented on this subject in the following manner:
“Although the present undersupply of teaching
candidates may make it appear that pretraining
selection is not as urgent as it was when the supply
of certificate holders was more abundant, it might
be argued that existing conditions make it even more
necessary than formerly for training institutions
to exercise greater care in the selection of their
students. Certainly the ultimate welfare of the
boys and girls who will come under the guidance
and tutelage of the candidates in a future teaching
position is fundamental and paramount. Despite
the present critical shortage, those who are
responsible for long-term planning in education dare
not lose sight of the need for the improvement of
the quality of instruction in the schools and of the
personnel to carry out that instruction. More valid
and reliable instruments of selection, then, are
essential for the protection of boys and girls
through the general improvement of the level of
the teaching profession. The responsibility for
assuring the highest possible quality of instruction
rests jointly with the institutions training
candidates, the agencies entrusted with the
certification of teachers, and the administration
charged with the selection of personnel.”

*All footnotes will be found at end of article.


In 1935, Barr² emphasized that one of the most
effective means for improving the quality of
instruction in schools is to admit only those of
superior potential teaching ability. The questions
then arise:


1. Can teaching efficiency be predicted? 

2. Is achievement in various educational or 
professional courses an indication of later 
teacher success? 

3. What instruments will serve as an adequate
basis for predicting success in teaching? 


School programs, teacher placement agencies
and teacher-education institutions would be aided
immensely if a valid instrument could be developed
for predicting teaching success. Because of
the complexity of the teaching-learning process,
however, no one instrument has yet been found to
accomplish this desired goal. Many attempts
have been made to forecast the success of
applicants for teaching positions. The most common
procedures are to evaluate application blanks,
questionnaires, college grades, physical and
emotional fitness, and to observe the teacher at work.
Even after a teacher is on the job, sound
instruments and techniques of evaluating and
measuring teaching effectiveness are of great aid for
the in-service education of teachers. Materials,
procedures and methods need to be evaluated from
time to time to keep abreast of modern developments
in the teaching field. To do this effectively,
sound measuring devices of teaching efficiency
are of utmost importance.


Review of Previous Investigations 


In the past many studies have been made of the
relationship of certain teacher competencies to
various criteria of teaching success. Barr? published
in 1948 a summary of one hundred and
thirty-eight such studies. There are five broad
categories into which these studies can be
classified:


1. General investigations.

2. Pupil growth and change as a measure of
teaching success.

3. Pupil ratings of the teacher as a criterion
of teaching success.

4. Supervisory ratings of the teacher as a
measure of teaching success.

5. Personal fitness of the teacher as measured
by respective tests in the area of: attitudes,
knowledge of the subject matter, personality
and temperament.


The studies which are categorized under 3 above
and the studies that use judgment or intelligence
tests of some sort as predictors of teaching
efficiency will be reviewed. Investigations that use
ratings of supervisors, grade-point average in
professional courses and student-teaching grades
as criteria were reported in this study. The
numerous studies made in the past will not be cited
here. The reader is referred to Barr's article
or to George J. Schick's doctoral dissertation on
file at the University of Wisconsin library.

In the fall of 1955, teacher judgment tests were
given to one hundred and forty-three seniors
enrolled in the School of Education at the University
of Wisconsin. The population of this study
consists of seventy-two teachers for whom supervisory
ratings could be obtained. Since no [random]
sample is used in this study, no statistical
inferences will be drawn; the study is a
descriptive-predictive investigation. From the
original one hundred and forty-three students,
eighty-six secured elementary or high school teaching
positions for whom supervisory ratings could be
ascertained.

Student-teaching grades, grade-point averages
in professional courses, overall grade-point average
in four years of college, and psychological
examination scores were collected and tabulated for
the teachers under consideration.

The “so-called judgment test” was administered
to the students. In making this approach it
is thought that teachers must make many judgments
in teaching and the quality of these judgments
will correlate positively with the effectiveness
of the teachers. Judgment is here [defined as]
the ability to form an opinion or conclusion [...]
a course of action from circumstances presented
to the person. It is not essential in this
investigation that one establishes the fact that the
test employed actually measures judgment since the
primary concern is with the predictive value of the
instrument regardless of what it might claim to
measure.


Criteria of Teaching Efficiency 


Many criteria of teaching efficiency have been
used in the past studies. Von Haden? discussed
the criterion of ratings of teaching efficiency in
the following way: “The evaluation of [teachers]
upon the basis of information gathered through
observational devices often loses sight of the fact
that what the teacher does may not be as significant
as how he does it. There is also the possibility
that the whole is more than the sum of the parts
in its effect upon the outcomes of instruction.
Then too, a specific teaching act may not be good
or bad in itself, but rather in relation to the [...]
situation in which it is used— the conditions which
give rise to it and the purpose for which it is
employed. This appropriateness factor is frequently
given inadequate consideration...


“As in the case of the ac[...]
[...] position to detect qualities and to evaluate them.
The problems involved in determining the qualities
to be considered and in objectifying the instruments
employed in making the evaluations have long been
recognized. The extent to which outstanding
proficiency in one respect compensates for deficiency
in another has, however, not been established...
Qualities, like procedures, may be closely dependent
upon the situations that give rise to them and as a
consequence may have to be measured in specific
situations.

“The complexity of the problem of establishing
a criterion of teaching success is rooted in the
varied nature of the teacher's work and the
consequent range of qualifications.”

While much has been achieved in this area, it
is still quite difficult to control the many factors
aside from the teacher which influence the learning
processes of pupils. No highly valid and reliable
measures are yet available to measure pupil
growth.

Teacher ratings by supervisors have always
been under attack as far as their validity and
reliability are concerned. It is clearly recognized
that different supervisors have different criteria
on what constitutes teaching efficiency in different
situations. But as long as they are employed
by their local board of education, as long as they
represent particular communities, and as long as
they judge what is acceptable or not acceptable in
teaching, their ratings should be given serious
consideration.


in This Study 
Three criteria will be employed in this
investigation, namely,
1. Grade-point averages as a measure of
effectiveness in professional courses,
2. Student-teaching grades as a measure of
pre-service teaching effectiveness, and
3. Principals' or supervisors' judgments of
the in-service effectiveness of the teachers as
measured by the Wisconsin Adaptation of the
M-Blank.


These criteria are not without shortcomings,
[but] they constitute three almost universally used
criteria of professional competency as judged on
the basis of pre-service preparation and early
teaching experience.

This study involves only these three criteria.
Claims of this investigation do not pertain to
the general efficiency of teachers but only as
indicated by these criteria. Criterion 1 will
be used because proficiency in professional
courses is judged important in teaching success.
Criterion 2 will be used because student-teaching
provides a pre-service opportunity to observe the
teacher in action. Criterion 3 will be used because
the supervisors' judgments are of great importance
for the particular situation in which the
teacher finds himself or herself.

If principals or supervisors think that certain 
qualities make a good teacher, and if the success 
or failure of teachers depends upon the possession 
of them, then they are ultimately of essential im- 
portance to the teacher. If teachers are hired, 
promoted, evaluated and fired by supervisors who 
made judgments on their teaching efficiency it is 
of little value to the teachers to know that there ex- 
ists a more valid criterion that would reveal their 
superior ability in teaching. 


Statistical Design of the Study 


In the preceding section the data-gathering de- 
vices were explained. These data were analyzed 
through the use of the coefficients of correlation, 
mean, standard deviation, F-test, and in some 
cases, through partial correlation, multiple corre- 
lation and an analysis of variance. The judgment 
tests were scored in three different ways. The 
group of seventy-two teachers for which responses 
could be secured was divided into random half A 
and random half B. Three scoring keys were derived.
One was a “rational” key based upon the
answers given to the several test items by three
professors and two advanced graduate students.
There were two different empirical scoring keys,
one derived from sub-group A and another from
sub-group B, using the mean as weights. An
additional scoring key was derived from group A using
the mean over the unbiased estimate of the
population variance as weights.

In the following section, empirical keying and
cross-validation will be explained in detail.
Cross-validation took place whenever a key from one
group was derived, the second group was scored
with it, and these scores were correlated with
their corresponding M-Blank ratings. The
cross-validation referred to here was carried out
with seventy-two cases for whom the data were
complete.

A total of one hundred and forty-three seniors
took the judgment test in the fall of 1955. Of these,
forty-two were not teaching in 1957, fifteen were
in the Armed Forces and eighty-six secured teaching
positions. Of the latter eighty-six who were
teaching, seventy-two supervisory ratings on the
Wisconsin Adaptation of the M-Blank could be
obtained. This constituted eighty-four percent of
the teachers for which follow-up letters were
written and returns could be obtained.


Empirical Keying and Cross-Validation

It is believed that the use of empirical keying
and cross-validation should be more [widely]
applied in prediction studies of this kind. As
previously mentioned, 143 seniors enrolled at the
University of Wisconsin in the fall of 1955 took a
teacher judgment test. One year later, when these
seniors were teachers, follow-up letters were
written to their supervisors and superintendents. The
supervisors and superintendents were asked to give
efficiency ratings of these teachers on the
Wisconsin Adaptation of the M-Blank. Only those
teachers for whom M-Blank ratings could be obtained
were selected for this study. From the 72 teachers
for whom M-Blank ratings could be obtained,
random halves were used to derive empirical keys.
All possible response options were listed for
each item of the judgment test, including “omitted
all options for that item” as an option itself.
Then, for each option a weight was found. This
was done by summing all the scores of M-Blank
ratings given by supervisors of those teachers
who marked a particular option of the test item.
This derived sum was divided by the number who
checked that particular option and thus the mean
score of the M-Blank rating of those teachers who
marked the mentioned option was found. In this
manner a weight could be found for all marked
options.

One would also need to know what weight an option
for a question should receive if it was not
checked by anyone in group A but was checked by
someone in group B. In order to answer this
question, the following procedure was applied. The
weights of all options marked were multiplied by
the number of cases upon which they were based.
The sum of all these products was then divided
by the total number of cases which made up the
various option weights in a particular question.
This newly derived weight was used for any and
all options that were not checked. Thus, if option
“omit” for a question had no weight, meaning
that everyone checked some other option for that
question, this newly derived weight was used for
it. Why do we need a weight for an option “omit”
of a question if nobody really omitted the
question? It is true that for that group from which
the scoring key was derived a weight for an option
“omit” was not necessary. But a second group was
scored with a key derived from the first, and for
that second group a possible answer to the above
question was “omit,” so a weight must be available
if anyone in the second group actually omitted
that question.
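
The weighting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `derive_option_weights`, the option labels, and the six ratings are hypothetical and do not come from the study's data. The sketch computes the mean M-Blank rating for each marked option of one item, then assigns every unmarked option (including "omit") the case-weighted mean of the marked-option weights.

```python
from collections import defaultdict


def derive_option_weights(responses, ratings, all_options):
    """Derive empirical weights for the options of ONE test item.

    responses   -- option marked by each teacher in the keying group
    ratings     -- M-Blank rating of each teacher (same order)
    all_options -- every possible option, including "omit"
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for option, rating in zip(responses, ratings):
        totals[option] += rating
        counts[option] += 1

    # Weight of a marked option: mean rating of those who marked it.
    weights = {opt: totals[opt] / counts[opt] for opt in counts}

    # Weight for unmarked options: marked weights multiplied by their
    # number of cases, summed, and divided by the total number of cases.
    n_cases = sum(counts.values())
    fallback = sum(weights[opt] * counts[opt] for opt in counts) / n_cases

    for opt in all_options:
        weights.setdefault(opt, fallback)
    return weights


# Hypothetical keying group of six teachers answering one item.
responses_a = ["a", "a", "b", "c", "b", "a"]
ratings_a = [60.0, 70.0, 50.0, 80.0, 54.0, 65.0]
weights = derive_option_weights(
    responses_a, ratings_a, ["a", "b", "c", "d", "omit"]
)
```

Note that the case-weighted mean of the option means is simply the mean rating of all teachers who answered the item, so an unmarked option receives a neutral weight.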

Random half A was used first in deriving the
empirical scoring key using the mean as weights.
Both groups A and B were scored with this key.
Since the key was derived from group A, the
scores of group A contained a bias. The unbiased
scores of group B were used for cross-validation.
The second half was then used in deriving a different
empirical scoring key and the first half was
used for cross-validation. This process is known
as double cross-validation and was used in this
investigation. The two halves are then independent
of each other and a new group will not have to be
found for cross-validation. This procedure saves
time, as generally a year is required before it is
known who the new graduates are and which graduates
could be used for cross-validation purposes.
Moreover, new conditions arising during the year
following establishment of an empirical key might
no longer make the two groups of graduates comparable.
The above procedure was achieved by
scoring the second half with a key derived from
the first random half and correlating these scores
with their corresponding M-Blank ratings of supervisors.
This coefficient of correlation was called
r_B. The biased scores of group A were also correlated
with their corresponding M-Blank ratings
and the coefficient of correlation was called r'_A.
The prime indicated it was biased and the subscript
identified the group. Thus, r_B was the coefficient
of correlation of the cross-validated
group B, or the unbiased validity measure for group
B. A coefficient of equivalence (a measure of
reliability) was found using Cronbach's alpha for
both groups.
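
The double cross-validation procedure described above can be sketched in Python. This is only an illustrative sketch under stated assumptions: the function names, the data layout (one tuple of checked options per testee), and the use of the overall mean rating as the weight for options nobody checked are assumptions made for the example and are not taken from the investigation.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def derive_key(responses, ratings):
    """Empirical key: weight of (question, option) = mean rating of those who checked it.

    Also returns a fallback weight for options nobody checked
    (ASSUMPTION: the overall mean rating of the deriving group).
    """
    by_option = {}
    for answers, rating in zip(responses, ratings):
        for question, option in enumerate(answers):
            by_option.setdefault((question, option), []).append(rating)
    key = {k: mean(v) for k, v in by_option.items()}
    return key, mean(ratings)


def score(answers, key, fallback):
    """Sum the weights of the checked options, using the fallback if unseen."""
    return sum(key.get((q, opt), fallback) for q, opt in enumerate(answers))


def double_cross_validate(resp_a, rate_a, resp_b, rate_b):
    """Return biased (primed) and unbiased validity coefficients for both halves."""
    key_a, fb_a = derive_key(resp_a, rate_a)
    key_b, fb_b = derive_key(resp_b, rate_b)

    def validity(resp, rate, key, fb):
        return pearson_r([score(a, key, fb) for a in resp], rate)

    return {
        "r'_A": validity(resp_a, rate_a, key_a, fb_a),  # A scored with its own key
        "r_B": validity(resp_b, rate_b, key_a, fb_a),   # B scored with key A
        "r'_B": validity(resp_b, rate_b, key_b, fb_b),  # B scored with its own key
        "r_A": validity(resp_a, rate_a, key_b, fb_b),   # A scored with key B
    }
```

Each half serves once as the derivation group and once as the cross-validation group, so no third sample is needed; the primed coefficients are expected to run higher than the unprimed ones because the key has been fitted to the same cases it scores.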

By looking at the various validity measures r'_A,
r_A, r'_B, r_B, which were actually computed, one
could ask the questions: What is the validity measure
for the combined biased groups? What is the
actual bias of group A that resulted when group A
was scored with a scoring key derived from group
A as compared with the unbiased, cross-validated
value of r_A? What is the actual bias of group B
that resulted when group B was scored with a scoring
key derived from group B as compared with
the unbiased, cross-validated value of r_B?

In order to answer these questions it is evident
that one cannot simply add the two cross-validated
coefficients of correlation and divide by two. They
are based on different groups, with different means
and standard deviations. To combine r_A and r_B
into one estimate, Fisher's z-transformation was
used. The reason for this transformation was, as
Fisher points out: “The transformation leads approximately
to a normal distribution. The advantage
of this transformation lies in the distribution
of the two quantities in random samples.
Standard deviation of r depends on the true value
of the correlation and the standard error of z is
practically independent of the value of the correlation.

“In the second place the distribution of r is
skew in small samples, and even for large samples
it remains very skew for high correlations.
The distribution of z is not strictly normal, but it
tends to normality rapidly as the sample is increased,
whatever may be the value of the correlation.
The simple assumption that z is normally
distributed will in all ordinary cases be sufficiently
accurate.”
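
The combination of two independent coefficients through Fisher's z can be sketched as follows. This is a minimal sketch, not the computation actually carried out in the investigation: the function name is hypothetical, and weighting each z by n - 3 (the reciprocal of its approximate sampling variance) is an assumption, since the text does not state which weights were used.

```python
from math import atanh, tanh


def combine_correlations(rs, ns):
    """Pool independent correlations via Fisher's z-transformation.

    rs: correlation coefficients, each strictly between -1 and 1
    ns: sample sizes on which the coefficients are based (each > 3)
    ASSUMPTION: each z is weighted by n - 3.
    """
    weights = [n - 3 for n in ns]
    z_values = [atanh(r) for r in rs]          # z = arctanh(r)
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    return tanh(z_bar)                          # back-transform to r


# Hypothetical coefficients for two halves of 36 cases each.
pooled = combine_correlations([0.25, 0.35], [36, 36])
```

Averaging in the z metric rather than averaging the r's directly matters most when the coefficients are large or the samples unequal; for small coefficients of similar size the two averages differ very little.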


A combination of the two independent unbiased
validity coefficients r_A and r_B was found. This
value was called Ep. This was also computed for
the two biased “validity” coefficients r'_A and r'_B.
The improved estimate was called rg. The differences
r'_A - r_A and r'_B - r_B, the actual bias, were
also calculated. The corresponding reliability
measures, α_A, α'_A, α_B, and α'_B, were also
computed.


TABLE II

JUDGMENT TEST DATA SCORED WITH AN EMPIRICAL KEY
USING THE MEAN AS WEIGHTS

                        Key A          Key B          Student Sums

Group A
  Sum                   43771      +   40912      =   84683
  Squares               53232297   +   46496482   ≠   199217739
  Mean                  1216           1136           1176
  Standard Deviation    19.15          8.05

Group B
  Sum                   43786      +   41058      =   84844
  Squares               53261712   +   46835856   ≠   199974230
  Mean                  1216           1140           1178
  Standard Deviation    12.84          16.27

Overall
  Sum                   87557      +   81970      =   169527
  Squares               106494009  +   93332336   ≠   399191969
  Mean                  1216           1138           1177
  Standard Deviation    16.20          12.87

Biased or Sequence
  Sum                   84829          84698


TABLE IV

BIASED AND UNBIASED SCORES OF THE JUDGMENT
TEST USING THE MEAN AS WEIGHTS

             Scoring with Key          Scoring with Key
             Derived Empirically       Derived Empirically
             from Group A              from Group B

Group A      Biased Scores of          Unbiased Scores of
             Group A                   Group A

Group B      Unbiased Scores of        Biased Scores of
             Group B                   Group B


TABLE V

BIASED AND UNBIASED COEFFICIENTS OF CORRELATION IN SCORING THE
JUDGMENT TEST USING THE MEAN AS WEIGHTS

                               M-Blank Ratings     M-Blank Ratings
                               of Group A          of Group B

Biased Scores of Group A       r'_A
Unbiased Scores of Group A     r_A
Biased Scores of Group B                           r'_B
Unbiased Scores of Group B                         r_B

Different means and standard deviations of biased
and unbiased scores were found. How do
these different measures compare with each other?
Is there a difference between groups, keys, or biased
and unbiased scores? To answer these questions
a composite analysis of variance of the judgment
test scores with an empirical key using the
mean as weights was computed. This method was
adapted from the procedure used by Stanley.

The procedure was used for the two random
halves, group A and group B. Taking a subject
at random from group A and one from group B results
in the simple latin square. As Stanley points
out: “There will be half as many such latin
squares as there are subjects in the investigation,
or as many as there are individuals in either
group. It is more convenient to set up the scores
for analysis in the form of Table 1, where the two
groups are kept separate. This is a consolidation
of n latin squares.”

Table II gives all the scores of the two groups. 
Table III shows the procedure of analysis. For
more detail the reader is referred to Stanley’s 
article. 

It was found, as shown in Table III, that there
was no difference between groups and no difference
between individuals within groups. But the
key derived from group A versus the key derived
from group B showed a very large significance.
Immediately the question then arises: why should
that be the case if two random halves were used
in determining the two keys? To answer this
question, let us look at the frequency polygons of
scores of the two keys; one can readily see
that the two distributions overlap only very little
(in fact, only one score does).

An F-test was computed and the difference of
means between the two group distributions of
scores was found to be highly significant. The
variances for the two keys were tested previously
for homogeneity and also were found to be significant
at the one percent level for the cross-validated
groups, as reported earlier. But why
this should be so is, as yet, not explained. Key A
resulted in much higher judgment test scores
than key B. The data indicated that key A [illegible]
mean and standard deviation of [illegible]
than did key B. Is this difference
in supervisory rating statistically meaningful?
A t-test was applied and the two means differed
appreciably but not significantly (t = [illegible]647
with 70 degrees of freedom). A test
of the two variances also resulted in a difference
which was not significant. This was in agreement
with the assignment of the judgment tests to
two random halves. Nevertheless, there was a
4.48 difference between the mean M-Blank of
groups A and B. This meant that when the key derived
from group A was used in scoring group A
and group B (giving biased and unbiased scores) of
the judgment tests, higher scores resulted than
when the total group was scored with a key derived
from group B. This was then the reason why such
large differences in scores resulted for the total
group and why the analysis of variance yielded
such a highly significant value between key A and
key B.

A second approach was made to illustrate that 
a still different empirical scoring key could be de- 
rived from the random halves. For this purpose 
an empirical key was derived only from group A. 
Instead of using the mean M-Blank rating for each 
option of every question as weights, the mean was 
divided by the unbiased estimate of the population 
variance of the distribution of M-Blank scores of 
those testees who checked a particular option. Val- 
idity and reliability measures were computed for 
the above scores. The 72 papers were also scored
“rationally.” Three professors and two advanced
graduate students derived a judgmental or rational 
key without regard to an external criterion. The 
familiar procedure of finding the proper weights 
for the various course grades via a z-transforma- 
tion was used. 


Conclusions 


The conclusions will be drawn with reference
to the three questions which were asked at the beginning
of the investigation.


1. Are scores on the judgment test predictive 
of success in professional courses? 


The coefficient of correlation between the judgment
test scores, scored with a “rational” key,
and the total scores in professional
courses was found to be .30. This indicates that
there is a positive relationship of moderate size
between the two variables. The answer to question
1 is then: Scores on the judgment test can be used
for predicting success in professional courses.
Since the coefficient of correlation is of only moderate
size, however, a regression equation would yield
predicted values in professional courses which would
be relatively inexact.
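
How inexact such predictions are can be made concrete with the standard error of estimate, which equals the criterion standard deviation multiplied by the square root of 1 - r². The short sketch below is an illustration added here, not a computation reported in the investigation; the function names are hypothetical.

```python
from math import sqrt


def coefficient_of_alienation(r):
    """Ratio of the standard error of estimate to the criterion SD."""
    return sqrt(1.0 - r * r)


def error_reduction(r):
    """Proportional reduction in prediction error gained by using r."""
    return 1.0 - coefficient_of_alienation(r)


# For the coefficient reported for question 1 (r = .30):
reduction_q1 = error_reduction(0.30)   # roughly 0.046
```

With r = .30 the errors of prediction are reduced by less than five percent relative to simply predicting the mean of the professional-course scores for every candidate.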


2. Are scores on the judgment test predictive 
of student-teaching grades? 


The coefficient of correlation between judgment
test scores, using a “rational” key in scoring
them, and the student-teaching grades was calculated
to be r = .11. This indicated that there
is a slight but insignificant relationship between
these two variables.


3. Are scores on the judgment test predictive
of supervisors' rating of teachers at the end
of a six-month period?


The coefficient of correlation between the latter
two variables was computed to be r = .21.
This value reveals a positive but statistically insignificant
relationship between the two variables
under consideration. The established correlation
coefficient of r = .21 is again too small to be of
any practical predictive value. But this r of .21
is about as large (or as small) as the r between
M-Blank and grade-point average. This finding
indicates that other variables would give as good,
or even better, predictive measures.

If all possible response patterns to the various
question options had been studied for
group A as well as group B, it might have been
possible to explain why the “rationally” keyed
judgment test scores correlated higher with the supervisory
ratings than did those of the cross-validated
groups. If there are k options in a question, and
the (k - 1) option is the option “omit all items”,
then 1+ d * (Ю+ Sis a +% patterns are pos- 


sible, where the second to the lastterm are sim- 
ply the binomial coefficients. 

While positive relations were found between 
the various variables appearing in the above three 
questions one must conclude that the relations 
were too small to be of practical predictive value. 
It is, however, possible that with the revised 
teacher judgment test higher correlation coeffi- 
cients and thus better predictive measures could 
be obtained. The revision was made by several
professors and advanced graduate students.
More research is needed in this area to establish
the long sought after goal of finding a measure
that predicts early and adequately later teaching
success. Perhaps, however, the M-Blank
ratings for groups heterogeneous with respect to
major fields and teaching location are inherently
unpredictable.


AN INVESTIGATION OF SECURITY-INSECURITY
AND ACHIEVEMENT-BOREDOM IN
ELEMENTARY SCHOOL CHILDREN


EARL W. KOOKER 
North Texas State College 
Denton, Texas 


Introduction 


IN THE PAST several years, the literature
concerned with the elementary school child has
increasingly emphasized the relationship of the
child's personality adjustment to his behavior in
the school situation. In discussions of the content
and methods of instruction, for example, various
concepts such as “feelings of insecurity” are
often introduced to account for differential behavior
of children in a given situation.

This study is concerned with an attempt to
formulate a method of measuring four adjustment
variables which appear frequently in the literature:
“feelings of security,” “feelings of insecurity,”
“feelings of achievement” (here used to
mean that a child feels the activities in which he
is engaged are worthwhile) and “feelings of boredom.”

In the present investigation, the procedure of
using an observer's check list was selected as the
method for defining these concepts. It seemed to
the investigator that this procedure tended to eliminate
several of the difficulties often encountered
in “self-rating” scales.

A test of the predictive usefulness of the measures
developed in this study was made by investigating
the hypothesis that a relationship exists between
amount of tardiness in school and the child's
rank on the achievement-boredom measure.


Statement of the Problem


The study involved the investigation of two hypotheses.
One is concerned with the development
of procedures for identifying the four feeling
states mentioned above; the second has to do with
a relation between two of these defined variables
and amount of tardiness in school.


*This is a summary of a dissertation submitted in partial fulfillment of the
Ph. D. to the Graduate School at the State University of Iowa in February 1951.
The writer wishes to thank the co-chairmen of his committee, Dr. Ralph H.
Ojemann and Dr. Harold Bechtoldt, for their guidance in the carrying out of
the study.


The first hypothesis can be divided into three
parts:


1. That professionally trained observers will
be able to consistently assign behavior descriptions
to specified categories described
in terms of inferred “feeling states.”


2. On the condition that sub-hypothesis “1” is
sustained, the items defined in that operation
can be scaled by the method of successive
intervals and can be utilized by an observer
in a rating situation.


3. On the condition that sub-hypothesis “2” is
sustained, that the scale scores will be
stable for different observers and for re-ratings
by the same observer.


The second hypothesis is: Children possessing
different scores on the achievement-boredom scale
will show differences in the incidence of tardiness
behavior in school.


Procedures


To test sub-hypothesis “1”, descriptions of a
variety of children's school behavior patterns,
which have been suggested as indicative of one of
the four feeling states, were obtained from the literature,
conversations with teachers, and observations
of children in school. The behavior patterns
were limited to school behavior since the
completed rating forms were to be used by an observer
in the school situation. Eighty-eight such
items were developed, each item consisting of
a lead sentence followed by a supplementary description
of the behavior and the situation in which
it occurred. For example:

The child who asks to be reassured by the teacher
concerning his academic work.

He asks what his grade is, how well he's
doing in his work, whether he's doing it right,
etc. This refers to work which has been assigned
and complete instructions given. Even
if he is told that he is doing well or satisfactorily,
he continues to ask for reassurance. This
occurs especially after others or the teacher
have emphasized the necessity for doing well
in his work.

The list of items was submitted to six graduate
students in the departments of psychology and
child welfare for classification into seven categories,
six of which were feeling states (insecurity,
aggression, achievement, security, submission,
boredom) and the seventh a “none” category.
The judges were asked to evaluate each item by:


1. Checking “none” if they felt the item indicated
none of the states listed,

2. Single checking all feelings which they felt
the item indicated,

3. Double checking the feeling which they felt
the item indicated most clearly relative to those
listed,

4. Triple checking a feeling if they felt the item
indicated it more clearly than any such state they
could recall.

The criterion adopted for 
item for use in the second 


To test sub-hypothesis “2” regarding the scaleability
of the items selected by the judges, twenty
graduate students in education, psychology and
child welfare were asked to scale the items by the
method of successive intervals.¹* This method
was chosen because it is less time consuming
than some of the other scaling methods, yet it
offers a test of internal consistency.

In the list of items presented to the judges,
each item appeared three times, each appearance
specifying a different frequency of occurrence of
the behavior described. The three degrees were
represented by the phrases “frequently,” “fairly
often” and “seldom.” This was done in anticipation
of giving raters three choices of frequency of
occurrence in the final rating scale.

The security-insecurity items were scaled on
one continuum and the achievement-boredom on
another. The judges were asked to consider
the frequency and type of behavior in placing
the item in one of eleven categories. On the security-insecurity
continuum, category one represented
the highest degree of security and category eleven the
highest degree of insecurity. On the achievement-boredom
continuum, category one represented the
highest degree of achievement and category eleven
the highest degree of boredom.

As a result of this step each variation of each
item was assigned a scale value representing
the degree of security-insecurity or achievement-boredom
the judges felt the item indicated.

*All footnotes will be found at end of article.
In the third phase of the study, rating-rerating
data and between-observer ratings of
children were obtained. The rerating data were
obtained at intervals of two weeks. In all ratings
the achievement-boredom and security-insecurity
items were randomly presented. The raters were
told not to discuss the items with one another and
were not informed of the feeling states on which
ratings were being obtained. They were instructed
to indicate one of three degrees of frequency
of occurrence (“frequently,” “fairly often,”
“seldom”) for each item, choosing the degree
which in their opinion best characterized the child.

Data were obtained from three schools. Schools
A and C were midwest college experimental
schools. School B was a midwest consolidated
school.

In April, all the children in the sixth grade (12
boys and 14 girls) in School A were rated and
rerated at an interval of two weeks by an outside
observer and the regular teacher. In the fall,
ratings and reratings by the regular teachers
were obtained for two sixth grades in School B
(ten boys and nine girls in one group and [number
illegible] girls and five boys in the other). In
December, ratings by two observers were obtained
for a group in School C (11 boys and eight girls).
Because of the organization of School C it was
necessary to use the present teacher as one observer
and the teacher who had taught the class the
previous year and summer as the other.

Using the data from the second between-observer
ratings of School A, items with the highest
between-observer consistency were selected, using
Guttman's (3) test of item reliability. The
items retained as having satisfactory reliability
(an average lower estimate of .40 and average upper
estimate of .50) were used to calculate inter-rater
and intra-rater reliability coefficients
(Pearson's r). In these calculations the remaining
ratings were used.
To determine whether the continua “security-insecurity”
and “achievement-boredom” were bipolar
in nature, separate security and insecurity
scores and separate achievement and boredom
scores were obtained for all four groups of subjects
listed above. To obtain a score for each
child, “frequently” was given a value of one;
“fairly often,” two; and “seldom,” three. Correlations
were then calculated between security
and insecurity scores and between achievement
and boredom scores for the four groups mentioned
above.

The final phase of the study was concerned with 
testing the hypothesis that if a child feels bored 
with the school, he will delay as long as possible 
entering the school situation and thus will show a 
higher frequency of tardy behavior than a child 
who feels he is achieving something significant in 
school. 

This relationship was investigated by using the
Mann-Whitney (4) technique to test the significance
of the difference in the incidence of tardiness
between pupils in the upper and lower halves of
the achievement-boredom score distribution. A
statistical test could be made only in the group
from School A since it was the only school which
recorded tardiness at the beginning of each class
period and thus the only group which had a sufficient
frequency of recorded tardiness to justify a
statistical test. In this group the outside observer's
rating was used to categorize the pupils on
achievement-boredom. The entire year's tardiness
records were employed.
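
The kind of comparison described above can be sketched in Python. This is a minimal illustration of the Mann-Whitney U statistic with a large-sample normal approximation; the function names and the tardiness counts are hypothetical, and the approximation shown carries no correction for ties, which is an assumption of the sketch rather than a detail taken from the study.

```python
from math import sqrt


def mann_whitney_u(x, y):
    """Return (U_x, U_y): U_x counts pairs in which an x value exceeds a y value.

    Tied pairs contribute one half to each statistic.
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    return u_x, len(x) * len(y) - u_x


def normal_approx_z(u, n1, n2):
    """Normal deviate for U under the null hypothesis (no tie correction)."""
    mean_u = n1 * n2 / 2.0
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean_u) / sd_u


# Hypothetical tardiness counts for the two halves of a distribution.
upper_half = [0, 1, 2, 2, 5]
lower_half = [3, 6, 7, 9, 12]
u_upper, u_lower = mann_whitney_u(upper_half, lower_half)
z = normal_approx_z(u_upper, len(upper_half), len(lower_half))
```

A rank-based test of this kind suits tardiness counts, which are typically skewed with many small values and a few large ones, because it makes no assumption that the counts are normally distributed.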


Results and Discussion 


The results will be presented in the same or- 
der as were the procedures. 

Of the eighty-eight items submitted to the
judges, forty-nine satisfied the criterion that four
of six judges must have double or triple checked
the same feeling for that item. Twenty-one of the
items were achievement-boredom items and
twenty-eight were security-insecurity items.
Twenty-four of the forty-nine items had been
double or triple checked by four judges; nineteen
by five judges; six by all six judges.

The two clinical instructors agreed with the
six graduate students to the following extent:


1. In twenty-three of the selected items, both
instructors double or triple checked the same category
as the judges.

2. In nineteen cases, one of the instructors
double or triple checked the same category.

3. In seven instances neither of the instructors
double or triple checked the same category as the
judges, but in five of these at least one did specify
that some degree of the feeling was indicated.


Thus the results indicated that some agreement
could be secured as to the “class name”
which should be associated with the items, though
the agreement was by no means perfect.

The twenty-one achievement-boredom items
and twenty-eight security-insecurity items selected
in the first phase of the study were the items
scaled by the twenty judges. In the method of successive
intervals, items are said to meet the test
of internal consistency if the estimates of the interval
boundary points obtained from overlapping
discriminal processes are approximately equal.
The standard deviations of these estimates were
calculated for each interval boundary point. This
was done for both the achievement-boredom and
security-insecurity scales.

On the achievement-boredom scale this standard
deviation ranged from .20 for boundary point
number eight to .06 for boundary point number
two with a mean and median of .14 for the ten
boundary points. On the security-insecurity scale
this standard deviation ranged from .34 for boundary
point number one to .02 for boundary point
number nine with a mean for the ten sigmas of .11
and a median of .05. The two relatively large values
of .34 for boundary point one and .29 for
boundary point two accounted for this difference
between mean and median values.

Mosier's (5) shortcut method of scaling provides
a method for estimating the discriminal dispersions
of each item in terms of the scale units.
In this study, eighty-three percent of the security-insecurity
items had dispersions of less than 1.50
and eighty percent of the achievement-boredom
items had dispersions of less than this value.

These data indicate that it is possible to scale
this type of item with considerable consistency.
In interpreting these results, it should be recognized
that more stable estimates of the boundary
points and scale values might have been obtained
if a larger number of judges had been used. Furthermore,
in this investigation the lines used for
estimating the sigmas and intercepts were fitted
by inspection and the estimates so obtained may
differ slightly from lines fitted by a method such
as least squares. These factors were recognized
in planning the study but it was felt that tests of the
scaleability of the items and usefulness of such
concepts in a relational study should be carried
out before the additional labor of procuring more
judges and using more precise methods of calculation
was undertaken.

In the third phase of the study, the application
of Guttman's (3) procedure resulted in the selection
of seventeen achievement-boredom items.²

The scaled scores of the thirty-six items, selected
in step three, were used to calculate rating-rerating
coefficients and between-observer coefficients.
These correlations for the various samples
are presented in Table I.

It will be noted that the rating-rerating coefficients
are fairly satisfactory, all approaching or
being greater than .90 with the exception of rater
1's rating-rerating on the security-insecurity
scale. This may have been due to the relatively
short time she spent in observing the children before
her first rating.

The between-observer correlations are not as
high as may be desirable. How much of an effect
the rating conditions had on this correlation is a
matter for further investigation. In School A, the
outside observer did not have the same freedom
to move about, hear what the children said and
have as continuous association with the children
as did the teacher. In the School C group the two
raters were not observing the children at the same time.

On the other hand, it should be recognized that
some of the agreement between teachers in rating
this group could have been due to the child's reputation
following him from one grade to the next.
Though they were told not to discuss their ratings
it was, of course, impossible to determine the effects
of previous discussions.

In order to determine whether there was a sig- 
nificant difference among these reliability coeffi- 
cients, a test of homogeneity described by Rider 
(8) was employed. The test was applied to the 
rating-rerating correlations for both scales and 
to the between-observer correlations for both 
scales. In the group from School A the outside 
observer’s rating-rerating was omitted since it 
could not be considered to be independent of the 
teacher's rating of the same children. In no in- 
stance could the hypothesis of homogeneity be re- 
jected at the five percent level; therefore, mean 
correlations, using Fisher’s ‘‘Z’’ technique, were 
calculated with the following results: 


1. The mean r for rating-reratings on the S-I
scale was .92.
2. The mean r for rating-reratings on the A-B
scale was .94.
3. The mean r for between-observer ratings
on the S-I scale was .68.
4. The mean r for between-observer ratings
on the A-B scale was .71.


All of these mean r's were significantly different
from zero.
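
A test of homogeneity of several correlation coefficients, followed by a pooled estimate, can be sketched as below. The sketch uses the familiar chi-square test on Fisher's z values; whether this matches the exact procedure described by Rider (8) is an assumption, as are the function name, the n - 3 weights, and the illustrative coefficients, none of which are taken from Table I.

```python
from math import atanh, tanh

# Upper five percent points of chi-square for small degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def homogeneity_of_correlations(rs, ns):
    """Chi-square test that several independent r's estimate one value.

    Returns (chi_square, degrees_of_freedom, pooled_r).
    ASSUMPTION: each z is weighted by n - 3.
    """
    zs = [atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    chi_square = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return chi_square, len(rs) - 1, tanh(z_bar)


# Hypothetical rating-rerating coefficients from four groups of raters.
chi_square, df, pooled_r = homogeneity_of_correlations(
    [0.90, 0.93, 0.95, 0.91], [26, 19, 18, 19]
)
homogeneous = chi_square < CHI2_CRIT_05[df]
```

Only when the hypothesis of homogeneity is retained does it make sense to report a single mean r for the groups; otherwise the pooled value would average over coefficients that differ by more than sampling error.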

In addition to the correlations between ratings,
another factor to be considered in investigating
the objectivity and stability of scale scores is the
difference between mean scores. The significance
of the difference between all rating-rerating
means and between-observer means was tested
by the use of “t” for related measures. The results
of applying this test appear in Table II.

An inspection of Table II reveals that the smallest
change in means tended to occur in the group
from School A. Perhaps this can be partially explained
by the fact that the raters in this group
had more opportunity to become familiar with the
items on the scale and with the children to be rated.
The items were available to them for a longer
period of time than was true for the other raters
(several weeks as compared to about two weeks).
In School A the ratings were made in the spring
and the other groups in the fall. It will be recalled,
too, that one of the two raters in School C was not
observing the children at the time the ratings were
made.

A longer training period in preparation for the
ratings may be desirable, more explicit directions
than were given might be helpful and further study
of item objectivity may be necessary. These results
along with those on the correlations suggest
that the scales, in their present form, may be useful
in studies concerned with the relationship of
scale scores to other variables but cannot reliably
be used in normative studies.

In the next phase of the study, the correlations
between achievement and boredom frequency
scores and between security and insecurity frequency
scores provided some evidence to indicate
that these pairs of feelings tended to represent
continua in the behavior of children. There was
a tendency for children showing a high frequency
of achievement behavior to show a low frequency
of boredom and vice versa. A similar relationship
was evident between the frequency of security
and insecurity behavior. The correlations for the
four samples are presented in Table III. The
teacher's ratings were used in each group. The
four security-insecurity and the four achievement-boredom
correlations were tested for homogeneity
and in neither case could the hypothesis of homogeneity
be rejected at the five percent level. The
mean correlation for security-insecurity was -.74.
Both of these correlations are significantly different
from zero.

Next, the relationship between the security-in- 
security scaled scores and the achievement-bore- 
dom scores was investigated by correlating these 
scores from the same four ratings used above. 
These correlations are presented in Table IV. 

A mean correlation of .78 resulted when these
four correlations were combined after being tested
for homogeneity. This correlation suggests
that achievement-boredom and security-insecurity,
as here defined, are rather closely related in children's
behavior. How much of the relationship
between the two scales is a function of “halo effect”
is not clear. Though the teachers were not
told on what variables they were rating, it is possible
that they used some sort of “good-bad”
standard when making both sets of ratings. Perhaps
using trained outside observers as raters,
since they would not be ego-involved in the type of
behavior children display in school, would throw
further light on the usefulness of maintaining the
two sets of traits as separate concepts.

In the final phase of the study, an analysis of
the relationship between achievement-boredom
and incidence of tardiness revealed that there was
a significant difference in tardiness between those
pupils in the upper and lower halves of the achievement-boredom
distribution. In School A, the average
frequency of tardiness for those in the upper




TABLE II


SIGNIFICANCE OF THE DIFFERENCE BETWEEN RATING-RERATING AND
BETWEEN-OBSERVER MEANS ON THE TWO SCALES


[Table body not recoverable. Rows: rating-rerating for Rater 1 and Rater 2 and
between-observer ratings, on the S-I and A-B scales, by school sample. Entries
were significance levels ranging from "Not significant" to "Significant at the
one-tenth of one percent level."]


TABLE III


PRODUCT-MOMENT CORRELATIONS BETWEEN SECURITY AND INSECURITY
FREQUENCY SCORES AND BETWEEN ACHIEVEMENT AND
BOREDOM FREQUENCY SCORES


                                     Sample

                                            School B    School B
Variables          School A    School C     Group 1     Group 2
Correlated         N = 26      N = 19       N = 18      N = 19

Security and
Insecurity         -.82        -.72         -.55        -.17

Achievement and
Boredom            -.16        -.16         -.50        -.83


TABLE IV


CORRELATIONS BETWEEN ACHIEVEMENT-BOREDOM AND
SECURITY-INSECURITY


                                 Sample

                                        School B      School B
              School A    School C      Group 1       Group 2

Correlation   .85         .47           [illegible]   [illegible]
half was 3.77 with a sigma of 5.35, and in the lower half the mean
was 8.62 and the sigma was 5.75. This difference was significant at
the two percent level as tested by the Mann-Whitney (4) technique.
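
The Mann-Whitney technique compares two groups by ranks rather than by means, which suits tardiness counts whose sigmas are as large as their means. The following is a minimal sketch, not the computation actually carried out in the study: it uses midranks for ties and the large-sample normal approximation without a tie correction. The tardiness counts are invented for illustration, and the function name is hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Return (U_x, U_y, z) for two independent samples x and y."""
    pooled = sorted(x + y)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (midrank)
        rank_of[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u_y = n1 * n2 - u_x
    # Normal approximation; no correction for ties is applied here
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return u_x, u_y, (u_x - mean_u) / sd_u

# Hypothetical tardiness frequencies (NOT the study's data)
upper_half = [0, 1, 2, 2, 3, 4, 5, 6, 9, 12]
lower_half = [2, 4, 6, 7, 8, 9, 10, 11, 14, 18]
u_upper, u_lower, z = mann_whitney_u(upper_half, lower_half)
print(f"U(upper) = {u_upper}, U(lower) = {u_lower}, z = {z:.2f}")
```

A negative z for the upper half indicates that its members tend to hold the lower ranks, that is, fewer tardinesses.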

Tardiness records were also obtained from the
two samples in School B, but there were too few 
incidences of tardiness recorded to justify a test 
of significance. In these schools the records 
were available only to December first and tardi- 
ness was recorded only once in the morning. In 
one group there were only three frequencies re- 
corded, all in the lower half of the achievement- 
boredom distribution. In the other group there 
was a total of fourteen recorded, twelve of which 
were in the lower half of this distribution. 


Summary and Conclusions 


This study was concerned with: (1) the development
of a procedure for defining and identifying
the feeling states of security, insecurity, achieve- 
ment and boredom, and (2) an investigation of the 
relationship between achievement-boredom scores 
and tardiness behavior in school. 

The following results were obtained: 


1. Considerable agreement could be obtained 
among professionally trained judges as to 
which behavior patterns can be used to de- 
fine each of the four states under investiga- 
tion. 

2. The security-insecurity and achievement-boredom
items selected in "1" satisfactorily met a test of
internal consistency when scaled by the method of
successive intervals on an eleven-point continuum.

3. The mean rating-rerating correlation for the scales
was .93 and the mean between-observer correlation was
.70. Significant differences both in rating-rerating
means and in between-observer means indicated a need
for a longer training period in preparation for making
ratings and for modification of some items.

4. Negative correlations between achievement and boredom
frequency scores and between security and insecurity
frequency scores suggested that each pair of states
represented a continuum in the behavior of children.
Appreciable positive correlations between
security-insecurity and achievement-boredom scores
raised some question as to the usefulness of
maintaining the two as separate concepts.


5. Children rated high in a feeling of achievement were
tardy less often (significantly so in the group in which
a test could be made) than those designated as being
bored with school.


A list of the items comprising the security-in- 
security scale together with the scale values is 
given in Appendix A (see original dissertation). 
The Field of Accounting


PUBLIC ACCOUNTING is a comparatively young and rapidly growing
profession. From August 1940 to November 1956, the American
Institute of Accountants, which is the national organization of
public accountants, increased in membership from 5,437 to 28,535,
that is, to more than five times its earlier size. It is estimated
that there are now in the United States approximately 19,000
certified public accountants in public practice. The
total number of accountants, including those in 
public and private practice and government ac- 
counting, but excluding those doing routine book- 
keeping work, is estimated to be more than 200,000. 

The importance of accounting in our financial 
structure is evident to everyone. Modern busi- 
ness could not function without accountants in both 
public and private practice. The accountant stands
in a unique role of trust. He is the final arbiter 
of the financial condition of many thousands of bus- 
iness organizations ranging from small private 
enterprises to giant corporations and including 
profit-making, non-profit, and governmental or- 
ganizations. It is essential that men of high abil- 
ity and unassailable integrity be attracted to, 
trained for, and retained in this profession. 


Brief History of Accounting Tests 


Objective testing was almost unknown in the accounting field until
about fifteen years ago. In 1943, the American Institute of
Accountants, in cooperation with a number of large accounting
firms, began a measurement project as part of a larger program of
improvement of the selection of personnel for public accounting.
Dr. Ben D. Wood of Columbia University was appointed project
director, and the Educational Records Bureau was designated as the
operating organization for the project. After about two years of
exploratory work in test construction and try-out, the testing
program was placed on a service basis starting in 1946 and 1947.
Since that time, test materials and services have continuously
been available to colleges and to employers in public


*All footnotes will be found at end of article. 


accounting firms and business and industrial organ- 
izations. 

From the beginning, recognition was given to 
the imperative need to emphasize the use of the 
tests in the selection and appraisal of young men 
for college training in the accounting field. In or- 
der to encourage college use of the tests, the 
charge to colleges has been set below actual cost, 
which fact has made it necessary for the program 
to be partly subsidized by the Institute. The 
materials and service charges to employers are 
more substantial. 

A recent development in the accounting testing 
program has been the provision of counseling in- 
struments for use at the high school senior level. 

A continuous research program is carried on in connection with the
project. More than thirty articles reporting research have been
published by the project office and the Institute, and a
considerable amount of research on the tests has appeared in other
places, such as in Doctor's and Master's theses.


Kinds of Tests Developed and Used 


The tests used in this project are of three kinds: 1) tests of
aptitude or orientation toward the accounting field, 2) achievement
tests, and 3) measures of interests. It has not been found possible
to include a fourth area, personal qualities, as a part of the
regular testing program for accountants, although the importance of
this area is recognized, and much informal appraisal of various
aspects of the personality of their employees is done by public
accounting firms.

Orientation Test: The Orientation Test, available for use
throughout all college years and in employment situations, is a
test of mental ability based on materials appropriate to the field
of business. It provides a verbal score derived from a vocabulary
subtest and a reading subtest, a quantitative score based upon
business arithmetic problems, and a total score. There are three
forms of this test, each of which requires fifty minutes of working
time.

In addition to the college-level Orientation Test, an Accounting
Orientation Test for High School Seniors has been available since
1953. It exists in two forty-minute forms, each of which covers
accounting vocabulary, arithmetic reasoning, and simple accounting
problems for which no previous study is required.

Achievement Tests: There are two levels of the accounting
Achievement Tests. Level I, which exists in three forms each
calling for two hours of working time, is planned for use with
students who have had at least one full year of accounting or the
equivalent. It may also be used with second-year students. Form A
yields a total score based on questions in the following areas:
account classification, accounting vocabulary, arithmetic of
comparative profit and loss statements, entering and posting, bank
reconciliation, adjustment in ten-column worksheet, analysis of
depreciation histories, and tracing the effect of errors. The other
two forms contain subtests which are similar to those in Form A but
not identical with them. Recently, three fifty-minute forms, with a
less extensive coverage, have also been made available.

The Achievement Test, Level II, is intended for use at the end of
the senior year in college or with applicants for employment or
employed accountants. It is available in two four-hour forms and
two two-hour forms. Form A, one of the four-hour tests, contains
the following parts: fundamental classification relationships,
entering transactions in books of original entry, posting books of
original entry, analysis of adjustments, analysis of comparative
operating statements of branches, cash record and bank
reconciliation, analysis of depreciation histories, tracing the
effect of errors, inventory methods, influence of inventories on
net profit, comparison of inventory methods, and auditing. The
two-hour forms contain fewer questions on accounting and none on
auditing. Each of the four forms yields one overall score. It is
possible to obtain part scores as well, but, since these are not
very reliable, no norms have been established for them.


Interests: Early in the project, a study was made of the Strong
Vocational Interest Blank for Men on the basis of blanks filled out
by more than two thousand public accountants in the United States
and more than one thousand in Canada. The findings were used in
developing typical profiles of interests on twenty-seven
occupational scales, including accountant and CPA, for the public
accountants in each country and for different levels of employed
accountants, such as junior, semi-


profession, and this 
sults for all persons 
this program. 


When the Committee on Accounting Personnel, under whose auspices
the project is carried on, turned its attention to the provision of
instruments for use in counseling at the high school level, the
need for interest measurement at that level was recognized.
However, since the Strong blank is not as well suited for use with
high school pupils as with college students and adults, it was
decided to experiment with the development of special profiles for
public accountants on the Kuder Preference Record-Vocational and
the Kuder Preference Record-Personal. In 1952, both these forms
were filled out by 578 practicing members of the American Institute
of Accountants representing different levels of employment, various
sizes of firms, a large age range, and a wide geographical
distribution. A study was made of the results for a group of 516
accountants who were satisfied with their work and a group of
sixty-two accountants who said that they were not satisfied, and
typical profiles for the satisfied accountants were established on
both preference records. These profiles were printed and made
available so that high school guidance personnel could use them in
counseling pupils about their patterns of Kuder interests as
compared with the interests of accountants.


College Accounting Testing Program 


The project office provides a service program for the Orientation,
Achievement, and interest tests. The program has two broad aspects,
referred to as the College Accounting Testing Program and the
Professional Accounting Testing Program.

The college program was begun in the academic year 1946-47, and the
tests have been available to the colleges each fall and spring
since that time. In 1951, testing at midyear was added to this
program.

All Orientation and Achievement Tests used in the college program
must be returned to the project office for scoring, statistical
analysis, and a report of the results in terms of raw scores and
percentiles based on national norms. The charge for the test
material and service is fifty cents per test. Bulletins summarizing
the results in participating colleges, and occasionally research
studies of the tests, are issued after each testing program.

In the spring of 1956, 219 colleges administered a total of about
fifteen thousand tests in connection with this program. Since the
beginning of services to the colleges, about five hundred and
twenty-five colleges have participated, and the total number of
tests administered to date is approximately 315,000.


Professional Accounting Testing Program 


The tests used in the professional testing program, for men outside
college, are the Orientation Test, the Level II Achievement Test,
Form A (four hours) and Form C (two hours), and the Strong
Vocational Interest Blank for Men. The Institute has established
forty regional offices in cities throughout the United States for
the administration of these tests. In addition, accounting firms
and business and industrial organizations may administer the tests
to members of their own staffs or applicants for positions,
provided they arrange to have one of their staff members certified
by the project office to serve as examiner. At present, there are
217 certified examiners aside from those in the regional offices.

Either local scoring or project office scoring may be used in the
professional program. Where local scoring is used, the charge for
the test materials is $2.50 per individual for the Orientation
Test, $2.50 for the Level II Achievement Test, and 10 cents for the
Strong blank. If the tests are returned to the project office for
scoring, the charge for material and service is $5.00 for the
Orientation Test, $5.00 for the Achievement Test, and $2.00 for the
Strong blank, or $12.00 per individual for the complete battery.
When central scoring is used, individuals may receive up to ten
copies of their scores and percentiles punched into special IBM
cards at no extra charge. These cards provide an official record
which may be used in employment interviews.

Since testing in the professional program is 
ordinarily on an individual or small-group basis, 
the volume of the professional program is a good 
deal smaller than that of the college program. Up 
to December 1, 1956, about 25,000 tests had been
given in the professional program. 


Program for the High School 


In the college and professional programs, the 
tests are never sold outright to users but remain 
under the control of the Committee on Accounting 
Personnel at all times. This enables the project 
office to make sure that copies of the tests are not 
left in the hands of students or others who might 
gain an unfair advantage in future administrations 
of the same forms of the tests. Such precautions
are necessary, since these tests may be used in 
screening for entrance to the study of accounting 
in college or for employment. 

In the case of the High School Accounting Orientation Test,
however, it is not necessary to use the same kind of safeguard,
since the uses of this test are not for selection but for guidance
and counseling. So, the high school orientation test may be
purchased by schools and kept on hand for use with individuals or
groups as needed. The test is usually scored at the school,
although if project office scoring is desired, it may be obtained
at a cost of 20 cents a test, which includes a report of the
results. Thus far, the use of the high school orientation test has
been small, although the societies of CPA's in a number of states
have indicated an interest in sponsoring the use of this test as a
guidance instrument in the high schools of their states. Since July
1953, about 8600 of these tests have been distributed for use in
high schools.


Norms 


Rather extensive percentile norms are avail- 
able for interpretation of the results of the account- 
ing tests. In the college program, fall, midyear, 
and spring norms have been established on the Or- 
ientation Test for one, two, three, and four years 
of accounting study. There are also fall, midyear,
and spring norms for one, two, and three years of
study on the Level I Achievement Test and for 
two years of study, three years of study, and grad- 
uating seniors on the Achievement Test, Level II. 

In the professional program, employed account- 
ant norms were set up for the Orientation Test, 
Form A, and Achievement Test, Level II, Form
C, on the basis of a special staff testing program 
carried on by accounting firms in the spring of 1950. 
There are also some norms for employed account- 
ants on Form A of the Level II Achievement Test. 

The two forms of the High School Accounting 
Orientation Test, Forms S and T, are accompa- 
nied by percentile norms for public high school 
seniors. These are spring norms for high school 
seniors in general, without regard to course of 
study. Thus far, separate norms for commercial 
course students have not been established on this 
test. 


Reliability and Validity 


There is more information on the reliability and validity of the
accounting tests than can be reported in detail in this study, but
an attempt will be made to summarize the data.

Reliability: Most of the reliability data are Spearman-Brown
odd-even correlations based on the scores of college students. All
these are derived from results for students at a given level of
study. The medians of the reliabilities are approximately as
follows: Orientation Test, Advanced Level, verbal score, .90;
quantitative score, .80; total score, .91; Orientation Test, High
School Level, vocabulary, .87; arithmetic reasoning, .76;
accounting problems, .78; total score, .91; Achievement Test,
Level I, two-hour form, .94; one-hour form, .89; Achievement Test,
Level II, four-hour form, .97; two-hour form, .88.
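
An odd-even reliability of this kind is obtained by scoring the odd-numbered and even-numbered items separately, correlating the two half scores, and stepping the half-test correlation up to full length with the Spearman-Brown formula, r_full = 2r / (1 + r). The sketch below illustrates the procedure; the item responses are invented and the function names are hypothetical, so it should not be read as the project's own scoring routine.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_brown(r, k=2.0):
    """Predicted reliability when a test is lengthened k times."""
    return k * r / (1 + (k - 1) * r)

def odd_even_reliability(item_scores):
    """item_scores: one row per examinee, one 0/1 entry per item."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))

# Hypothetical responses of five examinees to six items (NOT project data)
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(f"odd-even reliability = {odd_even_reliability(responses):.2f}")
print(f".88 stepped up to double length = {spearman_brown(0.88):.3f}")
```

The same formula gives a rough check on the figures above: doubling a test whose reliability is .88 would be expected to yield about .94, on the assumption (not verified here) that the added material is comparable to the original.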

These reliability coefficients are about the same size as those
reported for tests of aptitude and achievement in other fields. The
total scores of all these tests seem reliable enough for use in the
appraisal, selection, and guidance of individual students, although
the one for the two-hour form of Achievement Test, Level II, is
perhaps a little lower than might have been expected. Considerable
time is required to obtain a reliable measure of achievement in the
accounting field because of the need for the examinees to do a
considerable amount of reading about the test situation on which
the questions are based.

Validity: Evidence concerning the validity of the accounting tests
is of three general kinds: 1) correlations with school or college
grades, 2) correlations with CPA Examination grades, and 3)
correlations with criteria of success in employment.

Coefficients of correlation between the various tests and course
grades may be summarized as follows:


Orientation Test, Advanced Level, versus 
Course Grades in Nine First-Year Accounting 
Classes Distributed among Nine Institutions: med- 
ian r, verbal score, .33; quantitative score, .43; 
total score, .43. 

Orientation Test, High School Level, versus 
Grades in Bookkeeping Class in One High School, 
vocabulary, .46; arithmetic reasoning, .59, arith- 
metic problems, .49; total score, .59. 

Achievement Test, Level I, Two-Hour Form, 
versus Grades in Thirty-One First-Year Account- 
ing Classes Distributed among Seventeen Institu- 
tions, median r, total score, .59. 

Achievement Test, Level I, Two-Hour Form, 
versus Grades in Six Second-Year Classes in 
Four Institutions, median r, total score, .54. 

Achievement Test, Level II, versus Grades in
Six Advanced Accounting Classes in Five Institu- 
tions (Senior Year), median r, total score, .54. 


All these correlations are significantly positive, with those for
the Achievement Tests tending to run somewhat higher than those for
the Orientation Tests. The correlations seem about as high as could
be expected, in view of the rather low reliability of course grades
and the fact that a variety of qualities and aspects of behavior of
students help to determine grades assigned by instructors. In one
university, it was found that
large differences between rank in class based on 
course grades and rank derived from Achieve- 
ment Test score could usually be explained when 
the individual's background and record, including 
extra-curricular activities, were carefully 
studied. 

The second kind of criterion used, the CPA Ex- 
amination, is an important basis for state certifi- 
cation of accountants to engage in public practice. 
Nearly all states use for this purpose a series of 
essay examinations prepared by an examining 
board appointed by the American Institute of Ac- 
countants and scored at the Institute by a trained 
group of graders. The examinations are a valuable
criterion for use in studying the predictive
value of the objective accounting tests.


Several studies have been made of the relation- 
ship between scores on the accounting tests admin- 
istered in college and grades on the CPA Examina- 
tions taken some years later. 1 

The following are medians of an extensive ser- 
ies of correlations: Orientation Test versus CPA 
Examinations, verbal score, .37; quantitative 
score, .42; total score, .46; Achievement Test, 
Level II, versus CPA Examinations, .54. These
correlations are about the same size as those be- 
tween the tests and grades in courses studied. In 
view of the time interval between the taking of the 
tests and the CPA Examinations, the correlations 
are rather favorable to the validity of the tests. 

The ultimate criterion for appraising the worth of a set of
professional tests is success in performing the work of the
profession. It is difficult, however, to appraise the value of
tests for prediction of success on the job because the validity of
the criteria of success is open to question. As in studies of many
other tests, ratings of supervisors were the most readily available
criterion of job success, and these were used in a number of
studies of the accounting tests, even though ratings vary a great
deal from firm to firm and from one supervisor to another. In the
most comprehensive of these studies, median correlations between
accounting tests and ratings in thirteen firms on a scale which
included quality of work, quantity of work, knowledge of
accounting, ability to learn, dependability and integrity,
initiative and responsibility, cooperation, and overall value to
the organization were as follows: Orientation Test versus ratings,
verbal score, .29; quantitative score, [illegible]; total score,
.36; Achievement Test, Level II, versus rating, .55. Perhaps
because of differences in the care and accuracy with which the
ratings were done, the correlations varied a great deal according
to the firm from which they were obtained. The median correlation
between test scores and ratings ran as low as .19 in one firm and
as high as .74 in another. In general, scores on the CPA and
accountant scales of the Strong blank added little to the
prediction based on the Achievement and Orientation Tests. 2

Another criterion of job success used in a recent study in one
accounting firm is a salary index based on increase in salary
during a five-year period. 3 Correlations of test scores with
salary index for a group of ninety-eight accountants in this firm
were as follows: Achievement Test, Level II, versus salary index,
.33; Orientation Test, total score, versus salary index, .25;
Strong blank (CPA scale) versus salary index, [illegible].

The multiple correlation of Level II Achievement, Orientation
total, and Strong blank with salary index was .40 for this group.

These correlations indicate that the test scores are somewhat
related to success in accounting employment situations but that
many qualities other than those measured by the tests also enter
into job success.
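
A multiple correlation of .40 is only modestly higher than the best single correlation of .33, which is what one expects when the predictors overlap. The two-predictor case can be worked from the formula R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2). The sketch below applies it to the Achievement and Orientation correlations with salary index; the correlation between the two tests (r_12) is not given in this report, so the value used is purely hypothetical, and the function name is likewise an assumption.

```python
import math

def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two predictors."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

r_achievement = 0.33   # Achievement Test, Level II, versus salary index
r_orientation = 0.25   # Orientation Test total versus salary index
r_between = 0.50       # HYPOTHETICAL correlation between the two tests

R = multiple_r_two_predictors(r_achievement, r_orientation, r_between)
print(f"R with overlapping predictors   = {R:.2f}")
print(f"R if predictors were unrelated  = "
      f"{multiple_r_two_predictors(r_achievement, r_orientation, 0.0):.2f}")
```

The more the two tests correlate with each other, the less the second adds to the first; the reported .40 also includes the Strong blank, which this two-predictor sketch leaves out.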


Some Outcomes of the Testing Program 


In summary, the contributions of this testing 
program to the field of accounting include the fol- 
lowing: 


1. Reasonably reliable and valid tests of apti- 
tude and achievement have been devised and made 
available to schools and colleges and to employers 
for use in selection and upgrading of personnel 
for the accounting field. Accounting norms have 
been established for these tests, as well as for 
two widely used inventories of vocational interests 
and one of personal preferences. 

2. Continuous services are available to schools,
colleges, accounting firms, and business and in- 
dustrial organizations for scoring, statistical an- 
alysis, and reporting of results of the accounting 
tests in terms of national norms. 

3. A record of the results of the tests for each
individual examined is maintained in a permanent 
file in the project office, where it may be drawn 
upon at any time. These scores are regarded as 
the property of the individual concerned and are 
released only upon his written authorization. 

4. Research on the values of a variety of tests
for prediction of success in college and employ- 
ment has been carried on and is being continued. 

5. Finally, a considerable amount of interesting and useful
research data is being accumulated about the nature of the
personnel of the accounting profession. For example, equating of
scores on the accounting Orientation Test and the ACE Psychological
Examination and study of the ACE equivalents of Orientation Test
medians for first-year students in several colleges indicate that
students planning to enter accounting are a little above the
national average in verbal aptitude and much above average in
numerical aptitude. 4 In other words, the accounting field seems to
be attracting fairly able students, although there is a need for
further increase in the ability of those going into the field.


The most extensive and dependable comparisons are in the fields of
interests and personal preferences, for it is in these areas that
nationally used instruments have been applied to the accounting
field. On the Strong blank, the interests of public accountants
agree well with those of accountants and CPA's, as would be
expected, and also with those of production managers, purchasing
agents, bankers, and personnel managers. They do not correspond
with the interests of artists, ministers, or psychologists. 5 On
the Kuder Preference Record-Vocational, the average public
accountant is high in computational, clerical, and literary
activities but comparatively low in social service, outdoor, and
mechanical activities. The Kuder Preference Record-Personal
suggests that the average public accountant has some preference for
being active in groups and likes to work with people and have new
experiences. 6 Thus the typical public accountant emerges as a
comparatively intelligent, active, flexible, and sociable
individual who likes detailed computational work but also work in
which he can give verbal expression to his findings.

While this testing program seems helpful to the 
accounting field, it has by no means reached its
maximum usefulness. There are needs for im- 
proved tests, better norms for certain groups, 
further research on validity, and greater under- 
standing and use of these measurement devices at 
both the educational and employment levels. 